Abstract

This paper considers convolution equations that arise from problems such 
as measurement error and non-parametric regression with errors in variables 
with independence conditions. The equations are examined in spaces of gen- 
eralized functions to account for possible singularities; this makes it possible 
to consider densities for arbitrary and not only absolutely continuous dis- 
tributions, and to operate with Fourier transforms for polynomially growing 
regression functions. Results are derived for identification and well-posedness 
in the topology of generalized functions for the deconvolution problem and 
for some regression models. Conditions for consistency of plug-in estimation 
for these models are derived. 
1 Introduction

The focus of this paper is on convolution equations arising in models with
measurement error. Reviews of measurement error models are in Carroll
and Stefanski (2006), Chen, Hong and Nekipelov (2011) and Meister (2009). The
convolution equations that are examined here also arise in other contexts;
various models that go beyond measurement error models are enumerated in
Zinde-Walsh (2012). This paper is devoted to the mathematical treatment
of such equations.

Start with the classical measurement error model, where the variable of interest
$x^*$ is observed with error $u$:

$$z = x^* + u. \tag{1}$$

Here the density of $x^*$ is the function of interest, denoted $g$; the observed
$z$ has density $w$; suppose that the measurement/contamination error $u$ is
independent of $x^*$ and has density $f$. Then the convolution equation

$$g * f = w \tag{2}$$

holds. When densities exist, $g * f$ denotes $\int g(z - u) f(u)\,du$. In problems that
often arise in image processing, epidemiology and medicine the error density $f$
could be assumed known, e.g. when the error is Gaussian noise.

The assumption of a known error distribution may not be realistic; if $f$ is
not known, additional conditions are needed to identify $g$. If another
observation, $x$, on $x^*$ is available, $x = x^* + u_x$, conditions under which a unique
solution exists for the density of interest, $g$, were given by Kotlyarski (1967)
in the case of independence between $x^*$ and $u_x$; the approach there was to
consider the joint characteristic function of the two measurements.

The assumption of independence of the error in the second measurement
may be too strong for some applications. For example, in Cunha, Heckman
and Schennach (2010) one measurement for the latent variable representing
a skill of a child was constructed from test scores, where one could plausibly
assume independence for measurement error, but the extra measurements
came from reports by teachers and parents, where such an assumption could
be unrealistic. Assume that for the error of the second measurement, $u_x$, the
conditional expectation is zero: $E(u_x \mid x^*, u) = 0$ (but heteroscedasticity is not
ruled out). Denote by $x_k$ the $k$-th component of $x = (x_1, \ldots, x_d)$ and by $h_k(x)$
the function $x_k g(x)$, and assume existence of the density-weighted conditional
moments $w_{2k}(z) = E(w(z) x_k \mid z)$. Then in addition to equation (2) more
convolution equations can be written:

$$h_k * f = w_{2k}, \quad k = 1, \ldots, d.$$
Indeed (with integration here and everywhere in this paper over the whole
space),

$$E(w(z) x_k \mid z) = E(w(z) x^*_k \mid z) = \int (z_k - u_k) g(z - u) f(u)\,du.$$

Therefore a system of convolution equations arises for the unknown function
$g$ and functions $h_k$, where $h_k(x) = x_k g(x)$:

$$\begin{aligned} g * f &= w_1, \\ h_k * f &= w_{2k}, \quad k = 1, \ldots, d. \end{aligned} \tag{3}$$

Another model that leads to a system of equations is nonparametric
regression with Berkson error (see, e.g., Meister, 2009 for a review). Consider a
nonparametric regression model

$$y = g(x) + u_y,$$

where $x$ may be correlated with the error, $u_y$, but where some instruments,
$z$, are available such that

$$z = x + u, \tag{4}$$

with $z$ independent of $u$ (Berkson error) and $E(u_y \mid z) = 0$. Denote the
density of $x$ by $f_x$, the density of $z$ by $f_z$, the density of $u$ by $f_u$ and correspondingly
that of $-u$ by $f_{-u}$. Then equation (4) gives

$$f_x = f_z * f_{-u}. \tag{5}$$

Additionally, if the expectation conditional on $z$ exists, using independence
between $z$ and $u$,

$$w(z) = E(y \mid z) = E(g(x) \mid z) = \int g(z - u) f_u(u)\,du;$$

then

$$g * f_u = w. \tag{6}$$

The system of equations (5)-(6) involves two unknown functions, $f_u$ and $g$,
where usually the interest is in the regression function.
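
As an illustration of equation (6), the following minimal sketch checks numerically that the conditional mean of $y$ given $z$ equals the convolution of $g$ with $f_u$. The quadratic regression function, the Gaussian error density and the value of `sigma` are assumptions chosen only because the answer is then known in closed form, $z^2 + \sigma^2$; they are not part of the model in the text.

```python
import numpy as np

def conv_at(g, f_u, z, grid):
    """Approximate (g * f_u)(z) = integral of g(z - u) f_u(u) du by a Riemann sum."""
    du = grid[1] - grid[0]
    return float(np.sum(g(z - grid) * f_u(grid)) * du)

sigma = 0.5  # assumed standard deviation of the Berkson error u

def f_u(u):
    # Gaussian error density (an illustrative choice, not required by the model)
    return np.exp(-u**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def g(x):
    # hypothetical regression function with polynomial growth
    return x**2

u_grid = np.linspace(-8 * sigma, 8 * sigma, 4001)
z_values = np.array([-1.0, 0.0, 2.0])

# w(z) = E(g(z - u)) computed as a convolution, as in equation (6)
w_numeric = np.array([conv_at(g, f_u, z, u_grid) for z in z_values])

# closed form: E[(z - u)^2] = z^2 + sigma^2 when E[u] = 0 and Var[u] = sigma^2
w_exact = z_values**2 + sigma**2
```

Note that $g(x) = x^2$ is not integrable, so this example already lies outside the $L_1$ framework even though the convolution with a rapidly declining error density is well defined.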

The regression function could have, instead of an observable argument,
$x$, a mismeasured or latent argument, $x^*$. The model could provide more
equations if another measurement, $x$, on $x^*$ were available. Consider

$$y = g(x^*) + u_y; \tag{7}$$
$$x = x^* + u_x; \tag{8}$$
$$z = x^* + u. \tag{9}$$

Here $x$, $y$, $z$ are observed; assume that $u$ is a Berkson-type measurement
error independent of $z$ with (unknown) density $f$, and assume that $u_y$, $u_x$ have zero
conditional (on $z$ and the other errors) expectations. For this model Newey
(2001) proposed to consider an additional equation, assuming that the
conditional moment $E(g(x)x \mid z)$ exists. In the univariate case this gives a system
of two equations with two unknown functions (as discussed in Schennach,
2007, Zinde-Walsh, 2009). Define $h(x) = x g(x)$; then

$$\begin{aligned} g * f &= w_1, \\ h * f &= w_2, \end{aligned} \tag{10}$$

with $w_1 = E(y \mid z)$, $w_2 = E(xy \mid z)$ known.

In the multivariate case for $x = (x_1, \ldots, x_d) \in R^d$, $w_{2k} = E(x_k y \mid z)$,
$k = 1, \ldots, d$, $h_k(x) = x_k g(x)$ this can be generalized to equations (3). If the
density $f_{x^*}$ of $x^*$ were of interest, the equation $f_{x^*} = f_z * f_{-u}$ could be added.

If the independence between the mismeasured variable and measurement
error holds conditionally, then equation (2) can be written for densities
conditional on some $x_c$, where if such densities exist this equation is

$$\int g((z - u) \mid x_c) f(u \mid x_c)\,du = w(z \mid x_c).$$

Then equation (2) is defined for functions in spaces where the dimension of
the argument is augmented by the dimension of the conditioning variable,
$x_c$.

A common way of providing solutions to (2) and other equations is to
consider them in some normed function spaces, e.g. of integrable functions
such as $L_p$ or weighted $L_p$ spaces, e.g. Carrasco, Florens and Renault (2007).
Solving the equations is often done by employing Fourier transforms. Since
convolutions and Fourier transforms can be defined in different spaces, the
question is which spaces are best suited for the problems.

Devroye and Gyorfi (1985) view density from the perspective of the $L_1$ space,
since the density (when it exists as a function) is absolutely integrable.
However, in various problems of interest a density may not exist, as in cases of
measurement error for individuals answering survey questions (say, about
income or consumption) where the probability of truthful reporting is
non-zero and a mass point can arise (Hu, 2008). Density functions in $L_1$ do not
necessarily converge even if the corresponding distribution functions
converge uniformly. The way to overcome both the non-existence of density
and the convergence problems is to consider density as a generalized derivative
of the distribution function, as proposed in Zinde-Walsh (2008). This is done
by defining the distribution function as a functional on a suitable space of
well-behaved differentiable functions, so that with this definition the
distribution function inherits the good properties of the well-behaved functions and
becomes differentiable (details in Zinde-Walsh, 2008; see also section 2.1
below). Moreover, in the space of generalized functions the generalized densities
converge if the distribution functions converge; thus the problem of defining
the density for a distribution is well-posed there.

Working in spaces of generalized functions also extends the classes of
regression functions for which solutions can be obtained. In various
applications the object of interest may be represented by a "sum of peaks" function,
such as a sum of delta functions; see, e.g., Klann, Kuhn, Lorenz, Maass and
Thiele (2007), where applications in astrophysics and mass spectroscopy are
discussed. Functions with sparse support or support that includes isolated
points can arise in various applications. If $g$ represents a regression
function, applying the Fourier transform to the convolution equations in spaces of
functions may require severe restrictions on the function. For example, in
spaces of integrable functions (such as $L_1$, $L_2$) linear and polynomial
regression functions as well as distribution functions in binary choice models would
be excluded. Again, a natural extension is to consider spaces of generalized
functions where Fourier transforms are defined for functions that can grow
at polynomial rates as well as for objects with sparse support. Spaces of
generalized functions were utilized by Klann et al (2007) for sum of peaks
regression, by Zinde-Walsh (2008) for generalized density functions, and by
Schennach (2007) and Zinde-Walsh (2009) for the univariate errors-in-variables
regression model with possible polynomial growth in the function.

This paper examines convolution equations in generalized function spaces.
The interest here focuses on equation (2), where only the function $g$ is
unknown (deconvolution), and on the system of equations (3) with two unknown
functions. The generalized function spaces considered here are described in
detail in Schwartz (1966) and Gel'fand and Shilov (1964).

Much of the paper is devoted to a theoretical development of the problem 
of solving these convolution equations in spaces of generalized functions:
existence of a unique solution (identification) and continuity of the mapping 
from the known functions to the solution (well-posedness). 

The usual blueprint for deconvolution in function spaces works as follows.
Assume that $g * f = w$ holds and the convolution exists, e.g. for functions in $L_1$, say,
densities. The Fourier transform (Ft) is defined: for the function $g$,
$Ft(g)(\zeta) = \int g(x) \exp(i x^T \zeta)\,dx$; this is a characteristic function if $g$ is a density. The exchange
formula applies: for the Fourier transforms $\gamma = Ft(g)$, $\phi = Ft(f)$, $\varepsilon = Ft(w)$ a
convolution is transformed into a product:

$$g * f = w \implies \gamma\phi = \varepsilon.$$

Also, if additionally the Fourier transform exists for $h_k(x) = x_k g(x)$, then
$Ft(h_k) = -i \frac{\partial}{\partial \zeta_k} \gamma(\zeta)$; denote the derivative $\frac{\partial}{\partial \zeta_k} \gamma(\zeta)$ by $\gamma'_k$.

If $w$ and $f$ in equation (2) (thus also $\varepsilon$ and $\phi$) are known and $\phi \neq 0$, solve
the algebraic equation $\gamma = \phi^{-1}\varepsilon$, then apply the inverse Fourier transform,
$Ft^{-1}$, to obtain $g$,

$$g = Ft^{-1}(\gamma).$$
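
The blueprint above can be sketched numerically on a grid. The following is a minimal illustration under stated assumptions, not the estimator studied in this paper: $g$ and $f$ are taken to be Gaussian densities so that $w = g * f$ is known in closed form, the continuous Fourier transform is approximated by a discrete one, and frequencies where $|\phi|$ falls below an arbitrary threshold are dropped (a simple spectral cut-off introduced here only to avoid dividing by rounding noise). NumPy's FFT uses the sign convention $e^{-ix\zeta}$, which differs from the one in the text but does not affect the recovered $g$.

```python
import numpy as np

def gaussian(x, sd):
    return np.exp(-x**2 / (2 * sd**2)) / (sd * np.sqrt(2 * np.pi))

# grid and illustrative (assumed) densities
n, dx = 1024, 0.05
x = (np.arange(n) - n // 2) * dx
sd_g, sd_f = 1.0, 0.3
g_true = gaussian(x, sd_g)                    # density of interest
f = gaussian(x, sd_f)                         # known error density
w = gaussian(x, np.sqrt(sd_g**2 + sd_f**2))   # w = g * f in closed form

# discrete approximations of phi = Ft(f) and eps = Ft(w);
# ifftshift moves the point x = 0 to index 0 before transforming
phi = dx * np.fft.fft(np.fft.ifftshift(f))
eps = dx * np.fft.fft(np.fft.ifftshift(w))

# solve gamma * phi = eps where phi is not numerically zero (assumed cut-off)
keep = np.abs(phi) > 1e-6
gamma = np.zeros_like(eps)
gamma[keep] = eps[keep] / phi[keep]

# inverse transform to recover g on the original grid
g_rec = np.fft.fftshift(np.fft.ifft(gamma)).real / dx
max_error = float(np.max(np.abs(g_rec - g_true)))
```

In this noise-free setting `max_error` is tiny. With estimated $w$ the division by small values of $\phi$ amplifies errors, which is why well-posedness matters for consistency.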

When the functions that enter the equations are estimated from the
available data on the observables, the solutions are stochastic, and well-posedness
of the solutions becomes crucial for establishing consistency. Carrasco
et al (2007) and An and Hu (2012) discuss well-posedness that applies to similar
problems in various normed spaces, mostly in spaces of integrable functions,
$L_p$.

Thus for pursuing a similar approach in spaces of generalized functions 
the following questions are addressed here: 1. Under what assumptions 
do the convolution equations hold? 2. When is Fourier transform and its 
inverse defined? 3. When does the exchange formula hold (and when is a 
multiplicative product defined)? 4. When do transformed equations have a 
unique solution? 5. When is the problem well-posed, that is the solution 
continuously depends on the known generalized functions in the equation? 

Once the conditions for identification and well-posedness are established 
in the generalized functions space, the question of consistent plug-in estima- 
tion can be examined. Suppose that the known functions, such as the den- 
sities (characteristic functions) of the observables, conditional expectations 
of the observables (and their Fourier transforms) in the models considered 
are consistently estimated; the solutions to the convolution equations based 
on these estimated functions are now random generalized functions; do they 
converge in some stochastic sense to the true function in the topology of 
generalized functions? 

In order to answer this one needs to consider stochastic generalized func- 
tions which represent stochastic functionals on the spaces of well behaved 
(differentiable, etc.) functions. These are described in Gel'fand and Vilenkin 
(1964) and Koralov and Sinai (2007). The question of consistency requires 
providing conditions for stochastic convergence of Fourier transforms and 
inverse Fourier transforms, of derivatives and of products for random
generalized functions, that is, of all the operations that are involved in solving for the
unknown functions.

Section 2 of this paper introduces the spaces of generalized functions 
considered here, gives conditions when convolutions are defined for gener- 
alized functions of interest, and when products of Fourier transforms are 
defined. Then the convolution theorem ("exchange formula") is provided 
making it possible to transform convolutions into products of Fourier trans- 
forms. Section 3 gives results on existence and uniqueness of solutions of 
the transformed equation or system of equations. These results give condi- 
tions for identification. General results on well-posedness of the solutions 
are proved here in Section 3 (some were previously given in the working
paper Zinde-Walsh, 2009). Section 4 provides stochastic properties of
generalized functions. Section 5 gives conditions for consistent (in the topology of
generalized functions) deconvolution and also for consistent non-parametric
estimation of a regression function in the model (7)-(9).

2 Convolution equations in generalized functions

2.1 Spaces of generalized functions 

Many different spaces of generalized functions can be defined; each may be 
best suited to some particular class of problems. This paper focuses on well 
known classical spaces of generalized functions discussed in the books by
Schwartz (1966) and Gel'fand and Shilov (1964); reference is also made to 
related spaces as presented e.g. by Sobolev (1992). 

Define a space of well-behaved functions, sometimes called test functions;
denote a generic such space by $G$. The functions in $G$ are defined on the
real space, $R^d$, but may take values in the complex space since we consider
characteristic functions and Fourier transforms that can take complex values.
Two widely used spaces of test functions are $G = D$ and $G = S$.

The space $D$ is the linear topological space of infinitely differentiable
functions with compact support, so that $D \subset C^\infty(R^d)$, where
$C^\infty(R^d)$ is the space of all infinitely differentiable functions; to converge in $D$
a sequence of functions should be supported on a common bounded set and
converge uniformly, together with its derivatives of
all orders.

To define the space $S$ first introduce some notation. For any vector of
non-negative integers $m = (m_1, \ldots, m_d)$ and vector $t \in R^d$ denote by $t^m$ the
product $t_1^{m_1} \cdots t_d^{m_d}$ and by $\partial^m$ the differentiation operator
$\frac{\partial^{m_1}}{\partial t_1^{m_1}} \cdots \frac{\partial^{m_d}}{\partial t_d^{m_d}}$. The
space $S \subset C^\infty(R^d)$ of rapidly decreasing functions is then defined as:

$$S = \left\{ \psi \in C^\infty(R^d) : |t|^m |\partial^l \psi(t)| = o(1) \text{ as } t \to \infty \right\},$$

for any $d$-dimensional vectors of integers $m$, $l$, where $l = (0, \ldots, 0)$ corresponds
to the function itself, $|t|$ is the vector of absolute values of vector $t$, and $t \to \infty$
coordinate-wise; thus the functions in $S$ go to zero at infinity faster than
any power, as do their derivatives of any order. A sequence in $S$ converges if
in every bounded region each $|t|^l |\partial^k \psi(t)|$ converges uniformly.

The generalized functions space $G^*$ is the dual space, the space of linear
continuous functionals on $G$. For any $b \in G^*$ and any $\psi \in G$ denote the value
of the functional $b$ applied to $\psi$ by $(b, \psi)$. The topology is defined by weak
convergence: a sequence $b_n \in G^*$ converges to $b \in G^*$ if for any $\psi \in G$ the
sequence of the values of the functionals converges: $(b_n, \psi) \to (b, \psi)$.

Sobolev (1992) gives a general definition (in 1.8) where he points out a
subtle distinction between the functional and a generalized function. Any
generalized function, $b \in G^*$, can be defined by an equivalence class $\{b_n\}$ of
weakly converging sequences of test functions $b_n \in G$:

$$b = \left\{ \{b_n\} : b_n \in G, \text{ such that for any } \psi \in G, \ \lim_{n \to \infty} \int b_n(t) \overline{\psi(t)}\,dt = (b, \psi) < \infty \right\},$$

where $\int \cdot\,dt$ denotes the multivariate integral over $R^d$, the over-bar indicates the
complex conjugate for complex-valued functions and $(b, \psi)$ provides the value of
the functional $b \in G^*$ for $\psi \in G$. However, the same functional can be
represented by different generalized functions corresponding to different spaces $G$.
For example, consider the $\delta$-function. This is a linear continuous functional
on the space of continuous functions as well as on $D$ or $S$ and provides
$(\delta, \psi) = \psi(0)$; it can be represented as an equivalence class of $\delta$-convergent
sequences of continuous functions as well as of functions from $D$ or $S$. This
implies that a generalized function considered as a functional can sometimes
be extended to a linear continuous functional on a wider space.
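
Weak convergence of a $\delta$-convergent sequence can be illustrated numerically. In the sketch below the sequence $b_n$ consists of Gaussian densities with shrinking scale and the test function `psi` is a hypothetical element of $S$; both are assumptions made for illustration. The values $(b_n, \psi)$ approach $\psi(0) = (\delta, \psi)$ even though the functions $b_n$ themselves do not converge in $L_1$.

```python
import numpy as np

def psi(t):
    # hypothetical test function: smooth and rapidly decreasing (an element of S)
    return np.cos(t) * np.exp(-t**2)

def pairing(b_n, psi, grid):
    """Riemann-sum approximation of (b_n, psi) = integral of b_n(t) psi(t) dt."""
    dt = grid[1] - grid[0]
    return float(np.sum(b_n(grid) * psi(grid)) * dt)

def gaussian_bump(scale):
    # regular function that concentrates at 0 as scale -> 0
    return lambda t: np.exp(-t**2 / (2 * scale**2)) / (scale * np.sqrt(2 * np.pi))

grid = np.linspace(-10.0, 10.0, 200001)
scales = [1.0, 0.3, 0.1, 0.03]

# values of the functionals b_n applied to psi, and their distance from psi(0)
values = [pairing(gaussian_bump(s), psi, grid) for s in scales]
errors = [abs(v - psi(0.0)) for v in values]
```

The list `errors` decreases as the scale shrinks, which is the statement $(b_n, \psi) \to (\delta, \psi)$ for this particular $\psi$.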

Note that $D \subset S$ and thus there is the inclusion of the linear dual spaces:
$S^* \subset D^*$; convergence in $S^*$ of linear continuous functionals implies their
convergence in $D^*$; however, a sequence of elements of $S^*$ that converges in
$D^*$ may not converge in the topology of $S^*$ (see the example in section 3.2).

In the terminology of Schwartz (1966) generalized functions are sometimes 
called "distributions" and elements of S* "tempered distributions"; here we 
shall call them generalized functions indicating the specific space considered. 
In Sobolev (1992, p. 59) a diagram shows various chains of generalized func- 
tions spaces embedded in each other; these are spaces of functionals on spaces 
of continuously differentiable (of different orders) functions, continuously dif- 
ferentiable functions with compact support and Sobolev spaces. 

Any locally summable (integrable on any bounded set) function $b(t)$
defines a generalized function $b$ in $D^*$ by

$$(b, \psi) = \int b(t) \psi(t)\,dt \tag{11}$$

on the space of real-valued test functions or by

$$(b, \psi) = \int b(t) \overline{\psi(t)}\,dt \tag{12}$$

for complex-valued functions. Any locally summable function $b(t)$ that
additionally satisfies

$$\int \left((1 + t^2)^{-1}\right)^m |b(t)|\,dt < \infty \tag{13}$$

for some non-negative integer-valued vector $m = (m_1, \ldots, m_d)$, with $((1 + t^2)^{-1})^m$
denoting the corresponding product $\prod_{i=1}^d ((1 + t_i^2)^{-1})^{m_i}$, similarly by (12)
defines a generalized function $b$ in $S^*$; such generalized functions are called
regular functions in the space of generalized functions.

Generalized derivatives $\partial^m b$, $m = (m_1, \ldots, m_d)$, exist for any generalized
function $b$ in $D^*$ and $S^*$ and are defined as functionals by the value for any
function, $\psi$, as $(\partial^m b, \psi) = (-1)^m (b, \partial^m \psi)$ (here $(-1)^m = (-1)^{m_1 + \cdots + m_d}$). The
differentiation operator is continuous in $S^*$ and in $D^*$.
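
The definition of the generalized derivative can be checked numerically on a simple case. The sketch below takes the regular function $b(t) = |t|$, which is not twice differentiable in the ordinary sense, and evaluates $(\partial^2 b, \psi) = (b, \psi'')$ for the hypothetical test function $\psi(t) = e^{-t^2}$ (an assumption made for illustration). Integration by parts gives $2\psi(0)$, so the second generalized derivative of $|t|$ acts on this $\psi$ as $2\delta$.

```python
import numpy as np

def psi(t):
    # hypothetical rapidly decreasing test function
    return np.exp(-t**2)

def psi_second(t):
    # second derivative of psi, computed by hand
    return (4 * t**2 - 2) * np.exp(-t**2)

def pairing(values_b, values_psi, dt):
    """Riemann-sum approximation of (b, psi) = integral of b(t) psi(t) dt."""
    return float(np.sum(values_b * values_psi) * dt)

t = np.linspace(-10.0, 10.0, 20001)
dt = t[1] - t[0]

# (d^2 b, psi) = (-1)^2 (b, psi'') with b(t) = |t|
second_derivative_value = pairing(np.abs(t), psi_second(t), dt)

# value of the functional 2 * delta at psi
expected = 2 * psi(0.0)
```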

Thus, for example, a generalized density function, $f$, in the univariate
case is defined as a generalized derivative of the regular distribution function,
$F(x)$, by providing for any real-valued function, $\psi \in G$, the value

$$(f, \psi) = -\int F(x) \psi'(x)\,dx. \tag{14}$$

The generalized density thus defined is an element of $D^*$, or of $S^*$; moreover, it
continuously depends on the distribution function in the topology of either
$S^*$ or $D^*$ by continuity of the differentiation operator. E.g., if $F(x) = I(x > 0)$
(the indicator function $I(\theta) = 1$ if $\theta$ is true, zero otherwise), by substituting
into (14) we obtain the generalized derivative as the Dirac $\delta$-function that
provides $(\delta, \psi) = \psi(0)$.

Any generalized function can be represented as a generalized derivative
of some order of a continuous function; to be an element of $S^*$ such a
continuous function cannot grow faster than a polynomial. More specifically, by
Schwartz (1966, theorem VI, p. 239) any generalized function $b \in S^*$ can
be represented as a generalized derivative of a continuous function: $b = \partial^l c$ for
a continuous function $c(x)$, which can be written as $c(x) = (1 + x^2)^m \tilde{c}(x)$,
where the continuous function $\tilde{c}(x)$ is uniformly bounded on $R^d$: $|\tilde{c}| < V$.
Consider some fixed vectors of non-negative integers, $l$ and $m$, and a bound
$V$. Denote by $S^*_{l,m}(V)$ the class of generalized functions, $b$, that have such
a representation for $l$, $m$ and continuous $\tilde{c}$ with $|\tilde{c}(x)| < V$; any set of
generalized functions that is bounded in the topology of $S^*$ is contained in some class $S^*_{l,m}(V)$
(see Schwartz, 1966, (VII,5;5), p. 246-247). The smallest such $l$ is the order
of integration that applied to a generalized function produces a continuous
function and is related to the degree of singularity of the generalized function,
while $m$ characterizes its growth at infinity.

The Fourier transform (Ft) is defined for functions in $D$, and more
generally in $S$; it is an isomorphism of the space $S$: the
transform of a test function $\psi \in S$ also belongs to $S$ and its value at
any $s \in R^d$ is $Ft(\psi)(s) = \int \psi(x) e^{i x^T s}\,dx$ ($x^T$ denotes transpose). In the dual
spaces $D^*$ and $S^*$ the Fourier transform is given by $(Ft(b), \psi) = (b, Ft(\psi))$.
The Fourier transform is an isomorphism of the space $S^*$.

For the analysis of this paper we mostly consider the functions of
interest in the generalized functions space $S^*$, because the Fourier
transform is an isomorphism of $S^*$, so that it or its inverse can be applied
to any element, and the operator Ft and the inverse operator, $Ft^{-1}$, are
continuous in $S^*$.

Assumption 1. The generalized functions defined by the statistical 
model are in the space S*. 

This assumption allows for any distribution function on $R^d$ and does not
require existence of density functions; for regression functions it allows
growth at infinity, but limits it by (13); thus binary choice or polynomially
growing regression functions are included, but not functions of exponential
growth. Note that exponentially growing functions are included in the space
$D^*$.
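
Condition (13) can be explored numerically in the univariate case. The sketch below compares truncated versions of the integral in (13) for a quadratic function and for an exponentially growing one; the weight exponent $m = 2$, the truncation points and both functions are assumptions chosen for illustration. For the quadratic the truncated integrals settle near $\pi/2$, while for the exponential they keep growing with the truncation point, consistent with such functions lying outside $S^*$.

```python
import numpy as np

def weighted_integral(b, m, T, n=200001):
    """Riemann sum of (1 + t^2)^(-m) |b(t)| over [-T, T] (univariate case)."""
    t = np.linspace(-T, T, n)
    dt = t[1] - t[0]
    return float(np.sum((1 + t**2) ** (-m) * np.abs(b(t))) * dt)

m = 2  # assumed exponent of the weight in condition (13)

def quadratic(t):
    # polynomially growing regression function
    return t**2

def exponential(t):
    # exponentially growing function
    return np.exp(np.abs(t))

# truncated integrals over [-T, T] for increasing T
poly_values = [weighted_integral(quadratic, m, T) for T in (10.0, 100.0, 1000.0)]
exp_values = [weighted_integral(exponential, m, T) for T in (10.0, 20.0, 40.0)]
```

A bounded sequence of truncated integrals is of course only suggestive; for the quadratic the limit $\int t^2 (1 + t^2)^{-2}\,dt = \pi/2$ can be verified by hand.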

2.2 Existence of convolutions; convolution pairs 

The convolution of generalized functions can be defined in different ways
(see, e.g. Schwartz, 1966, p. 154 or Sobolev, 1992, p. 63; Gel'fand and Shilov,
1964, v. I, p. 103-104); it does not always have meaning and exists only for specific
pairs of mutual convolutors.

Consider the following spaces of test functions and of generalized
functions on $R^d$: $D$, $S$, $C^\infty$, $O_M$, $D^*$, $S^*$, $E^*$, $O_C^*$, where $C^\infty = C^\infty(R^d)$ is the
space of infinitely differentiable functions on $R^d$; $O_M \subset C^\infty$ is the subspace
of infinitely differentiable functions with every derivative growing no faster
than a polynomial, $E^*$ is the subspace of generalized functions with
compact support, and $O_C^*$ is the subspace of rapidly decreasing (faster than any
polynomial) generalized functions (Schwartz, 1966, p. 244). Table 1 shows
pairs of spaces for elements of which convolution is defined (X indicates that
convolution cannot be defined for some pairs of elements of the spaces); the
table entries indicate to which space the element resulting from the
convolution operation belongs. The table is an extended version of the one in
the textbook by Kirillov and Gvishiani (1982, p. 102) and summarizes the
well-established results in the literature.

[Table 1: The convolution table]

The convolution pairs in the table where convolution is defined all possess 
the hypocontinuity property (Schwartz, 1966, p. 167, p. 247-257). Hypocon- 
tinuity of a bilinear operation means that if one component of a pair is in a 
bounded set in G* and the other converges to zero in G*, the result of the 
bilinear operation converges to zero (Schwartz, 1966, pp. 72, 73). 

Convolution of a pair of arbitrary generalized functions is not always
defined in $S^*$. Bounded support or at least rapid decline at infinity is needed
for a convolution to exist. For example, convolution of a constant function
with another constant function on $R^1$ is not defined. Nevertheless there
are pairs of subspaces of generalized functions beyond those in the Table for
which convolution defines a generalized function; spaces where convolution is
defined can be combined. The convolution is a bilinear operation (Schwartz,
1966, p. 157); convolution of a tensor product of generalized functions on two
vector spaces, $R^{d_1}$, $R^{d_2}$, is the tensor product of the convolutions of functions
in each space (Schwartz, 1966, p. 158). Moreover, convolution of any number
of generalized functions can be defined in $D^*$ as long as all except possibly
one have compact supports, and this operation is associative and commutative
(Schwartz, 1966, p. 158); a variable shift or derivative of a convolution exists
and is obtained by a shift or differentiation of any of the generalized functions
entering the convolution (Schwartz, 1966, p. 160).

Definition. Call a pair of subspaces of generalized functions, $A \subset G^*$ and
$B \subset G^*$, a convolution pair $(A, B)$ if for any $a \in A$, $b \in B$ the convolution $a * b$
is defined in $G^*$ and is a hypocontinuous operation in the topology of $G^*$.

Note that if $(A, B)$ is a convolution pair then $(A \cup G, B \cup G)$ is also a
convolution pair.

All the pairs of spaces in the Table satisfy this definition. 

Assumption 2. The statistical model defines functions $g$ and $f$ in $G^*$
such that $g \in A \subset G^*$ and $f \in B \subset G^*$; the subspaces $(A, B)$ form a
convolution pair.

This assumption implies that (2) holds; it is often satisfied in statistical
problems. Convolution of generalized density functions exists, thus (1) leads
to (2) for all distributions even when the density functions do not exist in
the ordinary sense. The finite sum of $\delta$-functions considered by Klann et
al (2007) is in $E^*$, thus its convolution with any element of $D^*$ (or $S^*$) exists
in $D^*$ ($S^*$). Schennach (2007) considered a univariate errors-in-variables model
with instrumental variables and a regression function, $g$, bounded by
polynomials; these regression functions are in $S^*$, and their convolution with any generalized
density function from $O_C^*$ exists in $S^*$. For regression functions $g$ in subspaces
of $S^*$ where growth is more restricted, convolution with less rapidly declining
$f$ may exist.

For a generalized function $g$ denote by $h_k$ the product of $g$ by the $k$th
component of $x \in R^d$, $x_k$; for a regular function $g(x)$ this is $h_k(x) = x_k g(x)$.
In many cases when the convolution $g * f$ is defined, the convolution $h_k * f$
is also defined. Indeed, this is so in all the examples above: if $g$ has compact
support, so does any $h_k$; if $g \in C^\infty$, it is true of $h_k$ as well, etc. Thus some
models accommodate not only (2), but also other equations, e.g. providing
(3).

Assumption 3. The statistical model is such that in addition to
Assumption 2, $h_k \in A$, $k = 1, \ldots, d$.

2.3 Fourier transforms, exchange formula and some 
special convolution pairs 

Next, consider the Fourier transforms and the exchange formula. 

For the generalized functions in equations (3) denote the Fourier transforms
as $\gamma = Ft(g)$, $\phi = Ft(f)$ and $\varepsilon_i = Ft(w_i)$. Recall that the Fourier transform
always exists in $S^*$ and is a continuous isomorphism.

The classical case of the convolution pair $(S^*, O_C^*)$ is examined in Schwartz
(1966). If $(g, f)$ belong to $(S^*, O_C^*)$ the convolution exists and equation (2)
transforms into the multiplicative equation $\gamma\phi = \varepsilon$, where $\gamma \in S^*$, $\phi \in O_M$
and $\varepsilon \in S^*$ (Schwartz, 1966, p. 281-282). The product between a generalized
function in $S^*$ and a function from $O_M$ always exists in $S^*$; multiplication
is a hypocontinuous operation (Schwartz, 1966, p. 243-246). Thus there is a
dichotomous relation between the classical convolution pair $(S^*, O_C^*)$ and the
product pair of generalized functions spaces, $(S^*, O_M)$. Below we show that
this dichotomy extends to other convolution pairs of spaces.

Define a product pair of spaces as a pair $(\Gamma, \Phi)$ of subspaces of $S^*$ such
that for any $\gamma \in \Gamma$, $\phi \in \Phi$ the product $\gamma\phi$ defines an element $\varepsilon$ in $S^*$, and the
operation of multiplication for $(\Gamma, \Phi)$ is hypocontinuous in the topology of
$S^*$.

Theorem 1. If Assumptions 1 and 2 are satisfied then for any $(g, f) \in (A, B)$
the exchange formula applies.

Proof. Consider the special sequences studied by Mikusinski (Antosik et
al, 1973) and Hirata and Ogata (1958) that are defined for a generalized
function $b$ as $b_n = b * \delta_n$, with $\delta_n$ representing the following delta-convergent
sequence: for a number sequence $a_n > 0$ with $a_n \to 0$, the regular function
$\delta_n(x)$ is non-negative with support in $|x| < a_n$ and $\int \delta_n(x)\,dx = 1$; the
convolution with $\delta_n \in E^*$ is always defined. Moreover, $b_n \to b$ in $S^*$: indeed,
$(b * \delta_n, \psi) = (b, \psi * \delta_n)$, but $\psi * \delta_n$ is in $S$ and converges there to $\psi$, so
$(b * \delta_n, \psi)$ converges to $(b, \psi)$.

Since $g$, $f$ belong to a convolution pair of spaces $(A, B)$, we can enlarge the
convolution pair $(A, B)$ to include with every $b \in A$ or $B$ the corresponding
sequences $b_n$. Denote the resulting pair of spaces by $(\tilde{A}, \tilde{B})$. We show that this
is also a convolution pair. Note that the convolution $g_n * f_n$ is defined since the
support of $\delta_n$ is bounded. All that is needed is to show hypocontinuity; it
follows from the fact that if a set $T$ from $A$ or $B$ is bounded in $S^*$ so is the
corresponding set $\tilde{T}$ that contains every element $b \in T$ as well as all $b_n$. If
a sequence $b_m$ converges to zero in $S^*$ so does the corresponding $b_{mn}$, and
hypocontinuity extends to the enlarged convolution pair.

To show the exchange formula we need to establish that for any $(g, f) \in
(A, B)$ we get that $Ft(g) Ft(f) = Ft(g * f)$.

Start with the exchange formula of Hirata and Ogata for the sequences:

$$\lim Ft(g_n * f_n) = \lim Ft(g_n) \lim Ft(f_n).$$

Consider first the left-hand side; by the continuity of the Fourier transform in $S^*$

$$\lim Ft(g_n * f_n) = Ft(\lim (g_n * f_n)),$$

then by hypocontinuity of the convolution and because $f_n \to f$, $g_n \to g$ we
get that this is $Ft(g * f)$.

On the right-hand side, by continuity of the Fourier transform and convergence,

$$\lim Ft(g_n) \lim Ft(f_n) = Ft(g) Ft(f).$$

Denote the space of Fourier transforms of elements from $A$ (or $B$) by $Ft(A)$
(correspondingly, $Ft(B)$). Then $(Ft(A), Ft(B))$ is a product pair.
Hypocontinuity of the product follows immediately from the hypocontinuity of the
convolution and continuity of the Fourier transform and its inverse. ■

Corollary 1. Given a product pair of spaces $(\Gamma, \Phi)$ in $S^*$ the exchange
formula applies to any $(\gamma, \phi) \in (\Gamma, \Phi)$:

$$Ft^{-1}(\gamma\phi) = Ft^{-1}(\gamma) * Ft^{-1}(\phi).$$

The proof follows the proof of Theorem 1 above by replacing convolution
with product and Fourier transform by the inverse Fourier transform.

In examining problems such as deconvolution much of the literature
focuses on Fourier transforms, e.g. characteristic functions. The dichotomy
between convolution pairs of spaces and product pairs of spaces in $S^*$ makes it
possible to switch between the two types of pairs.

If is the characteristic function of a measurement or contamination 
error the condition G Om for the classical product pair it would require 
existence of all moments. It may be of interest to consider pairs where 
products of Fourier transforms of generalized functions exist for less smooth 
functions (e.g. with relaxed moments requirements on measurement error); 
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relaxing the smoothness of will require restricting the degree of singularity 
of 7. 

For a continuous function $\phi$ define the space of test functions,
denoted $G \oplus \phi G$, that consists of functions that can be represented as
$\psi_1 + \phi\psi_2$, with $\psi_i \in G$ ($G = D$ or $S$). Consider $\gamma \in G^*$ that can be
extended to a continuous linear functional on $G \oplus \phi G$; denote the linear
space generated by such $\gamma$ by $G(\phi)^*$. If $\phi \in G$ then $G(\phi)^* = G^*$. For any
$\psi \in G$ the value $(\gamma, \phi\psi)$ is defined and is a continuous functional with
respect to $\psi$; this defines the generalized function $\phi\gamma$: $(\phi\gamma, \psi) = (\gamma, \phi\psi)$.
Then $(G(\phi)^*, G \oplus \phi G)$ is a product pair; the corresponding spaces of
inverse Fourier transforms form a convolution pair.

For example, the derivative, $\delta'$, of a univariate Dirac $\delta$-function in $G^*$
can be multiplied by any continuous function, $\phi$, that is differentiable at 0,
since then $(\delta'\phi, \psi) = (\delta', \phi\psi)$ with $(\delta', \phi\psi) = -\phi'(0)\psi(0) - \phi(0)\psi'(0)$
under the usual convention $(\delta', \psi) = -\psi'(0)$. More
generally, for a product between a continuous function and a generalized
function to be defined, there is a trade-off between differentiability of the
continuous function and the degree of singularity of the generalized function.
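
The $\delta'$ example can be checked numerically. The sketch below approximates $\delta$ by a narrow Gaussian density of width `h`, so that $\delta'$ is approximated by its derivative, and pairs it with $\phi\psi$ for a function $\phi$ that is differentiable at 0 but not twice differentiable there. It uses the standard convention $(\delta', \psi) = -\psi'(0)$; the particular $\phi$, $\psi$, the width `h` and the quadrature settings are illustrative assumptions.

```python
import math

def delta_prime_pairing(F, h, half_width=10.0, steps=20000):
    """Approximate (delta', F) using the derivative of a N(0, h^2) density."""
    a = -half_width * h
    dx = 2 * half_width * h / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * dx                       # midpoint rule
        dens = math.exp(-x * x / (2 * h * h)) / (h * math.sqrt(2 * math.pi))
        total += (-x / (h * h)) * dens * F(x) * dx   # derivative of the density times F
    return total

# phi is continuous and differentiable at 0 (phi(0) = 1, phi'(0) = 1), not C^2 there.
phi = lambda x: 1.0 + x + abs(x) ** 1.5
# psi is a smooth, rapidly decaying test function (psi(0) = 1, psi'(0) = 2).
psi = lambda x: math.exp(-x * x) * (1.0 + 2.0 * x)

approx = delta_prime_pairing(lambda x: phi(x) * psi(x), h=0.01)
exact = -(1.0 * 1.0 + 1.0 * 2.0)   # -(phi'(0) psi(0) + phi(0) psi'(0))
print(approx, exact)
```

The approximation approaches the closed-form value as `h` shrinks, even though $\phi$ has no second derivative at 0; this is the local trade-off described above.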

Under Assumptions 1-3 the convolution equations (2), (3) lead to
corresponding equations for Fourier transforms:

$\gamma \cdot \phi = \varepsilon$ (15)

or to the system of equations

$\gamma \cdot \phi = \varepsilon_1$,
$\gamma'_k \cdot \phi = i\varepsilon_{2k}$, $k = 1, ..., d$. (16)



3 Solutions to the convolution equations: identification and well-posedness

3.1 Identification 

For identification the supports of the functions in the equations play an 
important role. 

Recall that for a continuous function $\psi(x)$ on $R^d$ support is defined as
the set $W = supp(\psi)$, such that

$\psi(x) \neq 0$ for $x \in W$, and $\psi(x) = 0$ for $x \in R^d \setminus W$.

Support of a continuous function is an open set.

Since generalized functions can be considered as functionals on the space
$S$, support of a generalized function $b \in S^*$ is defined as follows (Schwartz,
1966, p. 28). Denote by $(b, \psi)$ the value of the functional $b$ for $\psi \in S$.
Consider open sets $W$ with the property that for any $\psi \in S$ with $supp(\psi) \subset W$
the value of the functional $(b, \psi) = 0$; then define the null set for $b$ as the union
of all such sets: $\Omega = \cup W$. Then $supp(b) = R^d \setminus \Omega$. Note that a generalized
function has support in a closed set, for example, support of the $\delta$-function
is just one point 0.

We start with deconvolution for (2). In the deconvolution problem $f$, or
equivalently its Fourier transform, $\phi$, is assumed to be given. Typically, $\phi$ is
a characteristic function and thus is continuous and bounded.

The convolution equation (2) uniquely identifies $g$ for a known $f$ if it
can be shown that the corresponding equation (15) has meaning and can be
uniquely solved for $\gamma$.

Next a useful Lemma is proved. This Lemma shows that if a product
between a generalized function $\gamma$ and a continuous function $\phi$ is defined:
$\varepsilon = \gamma\phi$, then division of $\varepsilon$ by $\phi$ is uniquely defined on $supp(\phi)$ in $D^*$.

Lemma 1. Suppose that $\varepsilon = \gamma\phi$ in $D^*$; $\phi$ is a continuous function. Then
the generalized function $\phi^{-1}\varepsilon$ is uniquely defined in $D^*$ on $supp(\phi)$.

Proof. Denote $W = supp(\phi)$ and consider $D(W)$, the space of all the
functions in $D$ with supports restricted to belong to $W$; the space $D^*(W)$ is
the dual space for $D(W)$. Next, consider a covering of the open set $W$ by
bounded sets: $W = \cup W_\nu$, where each $W_\nu$ is an open bounded set. Similarly,
consider $D(W_\nu)$. Then any generalized function in $D^*$ can be restricted to
the dual space $D^*(W_\nu)$.

In $D^*(W_\nu)$ the generalized function $\gamma$ solves $\varepsilon = \gamma\phi$. Suppose that the
solution is not unique and there is $\tilde\gamma \neq \gamma$ such that $\tilde\gamma\phi = \varepsilon$. Then $(\gamma - \tilde\gamma)\phi$
is defined and represents a zero element in $D^*(W_\nu)$. A zero functional can
be extended from the space $D(W_\nu)$ to a zero functional on the space of
continuous functions on $W_\nu$, $C^{(0)}(W_\nu) \subset C^{(0)}(R^d)$. Then $((\gamma - \tilde\gamma)\phi, \psi) = 0$
for any $\psi \in C^{(0)}(W_\nu)$, but since $\phi$ is an invertible continuous function on
$W_\nu$ any continuous function $\tilde\psi$ in $C^{(0)}(W_\nu)$ has the representation $\phi\psi$ for
some $\psi \in C^{(0)}(W_\nu)$. This implies that $(\gamma - \tilde\gamma, \tilde\psi)$ is defined (equals zero) for
any $\tilde\psi$, thus the functional $\gamma - \tilde\gamma$ extends to $C^{(0)}(W_\nu)$ as a zero functional.
This is only possible if $\gamma - \tilde\gamma$ is a zero functional in $D^*(W_\nu)$. Then $\gamma - \tilde\gamma$ is a
zero generalized function in $D^*(W)$. ■

Recall that the difference between D* and S* is in the "tail behavior" only; 
on bounded sets the two spaces coincide. The following theorem provides 
deconvolution in S*. 

Theorem 2 Under Assumptions 1 and 2 assume that $\phi = Ft(f)$ is a known
continuous function; then for any $W$ where $supp(\phi) \supset W$, the Fourier
transform of $g$, $\gamma$, is uniquely defined on $W$; if it is further known that
$supp(\gamma) = W$, then $g$ is uniquely defined in $S^*$. Uniqueness holds automatically
if $supp(\phi) = R^d$.

Proof. By Theorem 1, (15) holds in $S^*$. Since $supp(\phi) \supset W$, by Lemma 1 a
unique solution $\gamma$ to (15) exists in $D^*(W)$. Since $\gamma$ is the Fourier transform
of $g \in S^*$, $\gamma \in S^*$ and defines an element in $S^*(W) \subset D^*(W)$ uniquely. If
support of $\gamma$ is restricted to $W$ a priori, then the solution is unique; thus
when $supp(\phi) = R^d$, the generalized function $\gamma$ is defined uniquely in $S^*$. An
inverse Fourier transform exists in $S^*$ for any $\gamma$. It is then possible to recover
$g$ by the inverse Fourier transform $g = Ft^{-1}(\gamma)$, thus $g$ is uniquely defined
whenever $\gamma$ is. ■

Generally by this Theorem identification in deconvolution holds whenever
$supp(\phi) = R^d$. However, this excludes some error distributions such as
the uniform or the triangular, where the characteristic function has isolated
zeros. A more general result (Schwartz, 1966, pp. 123-125) establishes the
possibility of division by $\phi$ even when $\phi$ has zeros. In the one-dimensional
case as long as $\phi$ is infinitely differentiable, the zeros are isolated and there
exists a finite order derivative that is non-zero at every zero point of $\phi$,
any generalized function can be divided by $\phi$. Thus since the uniform and
triangular distributions have this property, deconvolution with these error
distributions is also identified in $S^*$.
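
To make the condition on the zeros concrete, the following sketch evaluates the characteristic function of the uniform distribution on $[-1, 1]$, which is $\sin(s)/s$, and its first derivative at the first few zeros $s = k\pi$. The choice of the interval $[-1, 1]$ and the function names are assumptions of this illustration.

```python
import math

def cf_uniform(s):
    """Characteristic function of the uniform distribution on [-1, 1]."""
    return 1.0 if s == 0.0 else math.sin(s) / s

def cf_uniform_derivative(s):
    """First derivative of sin(s)/s for s != 0."""
    return (s * math.cos(s) - math.sin(s)) / (s * s)

ks = [1, 2, 3, 4, 5]
zeros = [k * math.pi for k in ks]
values_at_zeros = [cf_uniform(z) for z in zeros]
derivs_at_zeros = [cf_uniform_derivative(z) for z in zeros]

for k, v, d in zip(ks, values_at_zeros, derivs_at_zeros):
    # At s = k*pi the function vanishes while the derivative equals (-1)^k / (k*pi).
    print(k, v, d)
```

The zeros are isolated and simple (the first derivative is non-zero at each). For a triangular distribution obtained as the sum of two independent uniforms the characteristic function is the square of a function of this type, so its zeros are double and the second derivative is the first non-vanishing one.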

The next Theorem examines identification in the case when there are
two unknown functions in the system (3); under Assumptions 1-3 it leads to
(16). The main conditions are continuous differentiability of one of $\gamma$ or $\phi$,
assumptions about support and knowledge of the value of the differentiable
function at an interior point. If that function is a characteristic function, its
value at 0 is always 1.

Theorem 3 Under Assumptions 1-3 for the system of equations (16) if
$supp(\phi) \supset supp(\gamma) = W$, where $W$ is a connected set in $R^d$ that includes
0 as an interior point, and

(a) if $\gamma$ is continuously differentiable in $W$, $\gamma(0) = c$, then $\gamma$ is uniquely
defined on $W$ by

$\gamma(s) = c \exp \int_0^s \sum_{k=1}^d \varkappa_k(t) dt_k$, (17)

with the uniquely defined continuous functions $\varkappa_k$ that solve

$\varkappa_k \varepsilon_1 - i\varepsilon_{2k} = 0$, $k = 1, ..., d$;

or

(b) if $\phi$ is continuously differentiable in $W$, $\phi(0) = 1$, then $\gamma$ is uniquely
defined on $W$ by

$\gamma = \phi^{-1}\varepsilon_1$, (18)

where

$\phi(s) = \exp \int_0^s \sum_{k=1}^d \varkappa_k(t) dt_k$,

with the uniquely defined on $W$ continuous functions $\varkappa_k$ that solve

$\varepsilon_1 \varkappa_k - ((\varepsilon_1)'_k - i\varepsilon_{2k}) = 0$, $k = 1, ..., d$.

Then $g = Ft^{-1}(\gamma)$. If $supp(\gamma) = W$, then $g$ is uniquely defined. Uniqueness
holds automatically if $W = R^d$. If $supp(\phi) = W$, then $\phi$ is also uniquely
defined; and so is $f = Ft^{-1}(\phi)$.

Proof. (a) Consider the space of generalized functions $D^*(W)$ (defined in
proof of Lemma 1). Since continuous $\gamma$ is non-zero on $W$, by Lemma 1
(reversing the roles of $\phi$ and $\gamma$ there) in $D^*(W)$ the generalized function
$\phi$ is uniquely expressed as $\gamma^{-1}\varepsilon_1$. Substitute this expression for $\phi$ into the
differential equations in (16), and denote the continuous functions $\gamma'_k \gamma^{-1}$ by
$\varkappa_k(\xi)$ to obtain equations

$\varkappa_k(\xi)\varepsilon_1 - i\varepsilon_{2k} = 0$, $k = 1, ..., d$ (19)

where the left-hand side is defined and equals zero in $S^*$.

We can show that the function $\varkappa_k$ is uniquely determined in the class of
continuous functions on $W$ by the equation (19). Proof is by contradiction.
Suppose that there are two distinct continuous functions on $supp(\gamma)$, $\varkappa_{k1} \neq \varkappa_{k2}$,
that satisfy (19). Then $\varkappa_{k1}(\bar x) \neq \varkappa_{k2}(\bar x)$ for some $\bar x \in supp(\gamma)$. Without
loss of generality assume that $\bar x$ is in the interior of $W$; by continuity $\varkappa_{k1} \neq \varkappa_{k2}$
everywhere for some closed convex $U \subset W$. Consider now $D(U)^*$; we can
write

$(\varepsilon_1(\varkappa_{k1} - \varkappa_{k2}), \psi) = 0$

for any $\psi \in D(U)$. A generalized function that is zero for all $\psi \in D(U)$
coincides with the ordinary zero function on $U$ and is also zero for all $\psi \in D_0(U)$,
where $D_0(U)$ denotes the space of continuous test functions on $U$. For
the space of test functions $D_0(U)$ multiplication by continuous $(\varkappa_{k1} - \varkappa_{k2}) \neq 0$
is an isomorphism. Then we can write

$0 = ([\varepsilon_1(\varkappa_{k1} - \varkappa_{k2})], \psi) = (\varepsilon_1, (\varkappa_{k1} - \varkappa_{k2})\psi)$,

implying that $\varepsilon_1$ is defined and is a zero generalized function in $D_0(U)^*$. If that
were so $\varepsilon_1$ would be a zero generalized function in $D(U)^*$ since $D(U) \subset D_0(U)$,
but this is not possible since $\varepsilon_1 = \gamma\phi$, which is non-zero by assumption.

Next we show that $\gamma$ is then uniquely determined on $W$. Indeed, for
any $t, s \in supp(\gamma)$ with the kth coordinate denoted $t_k$, write the continuous
function

$\gamma(s) = c \exp \int_0^s \sum_{k=1}^d \varkappa_k(t) dt_k$,

where integration is along any arc joining 0 and $s$ in $W$. This is the unique
solution to $\gamma(0) = c$, $\gamma^{-1}\gamma'_k = \varkappa_k$ (see, e.g., Schwartz, 1966, p. 61). Then in
$S^*$ $g = Ft^{-1}(\gamma)$ is uniquely defined.
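
A one-dimensional sanity check of the reconstruction formula (17): for the standard normal characteristic function $\gamma(s) = e^{-s^2/2}$ the logarithmic derivative is $\varkappa(t) = \gamma'(t)/\gamma(t) = -t$, and integrating it from 0 to $s$ should return $\gamma(s)$ with $c = \gamma(0) = 1$. The Gaussian choice, the trapezoid rule and the helper name are assumptions of this sketch.

```python
import math

def gamma_from_log_derivative(kappa, s, c=1.0, steps=2000):
    """Evaluate c * exp( integral_0^s kappa(t) dt ) with the trapezoid rule."""
    h = s / steps
    integral = 0.5 * (kappa(0.0) + kappa(s)) * h
    integral += sum(kappa(i * h) for i in range(1, steps)) * h
    return c * math.exp(integral)

# Logarithmic derivative gamma'/gamma for gamma(s) = exp(-s^2 / 2).
kappa = lambda t: -t

points = (0.5, 1.0, 2.0)
recovered = [gamma_from_log_derivative(kappa, s) for s in points]
exact = [math.exp(-s * s / 2.0) for s in points]
max_error = max(abs(r - e) for r, e in zip(recovered, exact))
print(recovered, exact)
```

In the theorem $\varkappa_k$ is not known directly but is identified from $\varepsilon_1$ and $\varepsilon_{2k}$ through (19); the sketch only illustrates the final integration step.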

(b) In view of the result in Theorem 2 it is sufficient to show that $\phi$
is uniquely determined on $W$. Consider the space of generalized functions
$D^*(W)$. Since $\phi$ is non-zero on $W$ and continuously differentiable, then
by differentiating the first equation in (16), substituting from the second
equation and multiplying by $\phi^{-1}$ in $D^*(W)$ (where the product exists as
shown in Lemma 1) we get that the generalized function

$\varepsilon_1 \phi^{-1}\phi'_k - ((\varepsilon_1)'_k - i\varepsilon_{2k})$

equals zero in the sense of generalized functions, in $D^*(W)$. Note that by
assumption $\varepsilon_1$ cannot be zero on $W$ and both $\varepsilon_1$ and $\varepsilon_{2k}$ are zero outside of
$W$. Define $\varkappa_k = \phi^{-1}\phi'_k$; then $\varkappa_k$ is continuous on $supp(\gamma)$ and is a regular
function that satisfies the equation

$\varepsilon_1 \varkappa_k - ((\varepsilon_1)'_k - i\varepsilon_{2k}) = 0$. (20)



We can show that the function $\varkappa_k$ is uniquely determined in the class of
continuous functions on $W$ by the equation (20). Proof is identical to the
proof in part (a) for equation (19). Next we show that $\phi$ is then uniquely
determined on $W$. Indeed, the continuous function

$\exp \int_0^s \sum_{k=1}^d \varkappa_k(t) dt_k$

is the unique solution to $\phi(0) = 1$, $\phi^{-1}\phi'_k = \varkappa_k$; then since $\varkappa_k$ $(= \phi'_k \phi^{-1})$ is
uniquely determined on $W$, so is this function on $W$, where it coincides with $\phi$.

By Theorem 2, $\gamma$ is then uniquely defined on $W$. Then in $S^*$ $g = Ft^{-1}(\gamma)$
is uniquely defined. ■

This Theorem extends the identification results of Schennach (2007) and
Zinde-Walsh (2009) to the multivariate case and to the case when $\gamma$ rather
than $\phi$ is continuously differentiable (part (a) of this Theorem), and extends
the identification result of Cunha et al (2011) by showing that in the model
with several measurements considered there identification additionally holds
with the requirement of continuous differentiability of $\phi$ replacing the
requirement that $\gamma$ be continuously differentiable (part (b)).

3.2 Well-posedness of the deconvolution in S*

Well-posedness requires that a unique solution to the problem exist and that 
this solution be continuous in some "reasonable topology" (Hadamard, 1923). 
Here most of the results consider the topology of generalized functions which
is weaker than, say, the uniform or $L_1$ norm for corresponding subspaces, thus
well-posedness may hold in this topology when it does not hold in the usual
norms. But well-posedness may not always obtain even in this weak topol- 
ogy. An example is provided that involves deconvolution with a supersmooth 
distribution. 

Consider a convolution pair of spaces $(A, B)$ in $S^*$; denote by $C$ the space
of elements in $S^*$ represented by convolutions of generalized functions from
$(A, B)$. Well-posedness of the deconvolution would require that for the given
$f \in B$ and any sequence of $w_n \in C$ that converges to some $w \in C$ in the
topology of $S^*$: $w_n \to w$, the sequence of $g_n \in A$ such that $g_n * f = w_n$
would converge to $g \in A$, with $g * f = w$. By Theorem 1 this convergence
can be restated for the product pair of spaces $(\Gamma, \Phi)$ where the product pair
is the image of the convolution pair $(A, B)$ under Fourier transform; denote
by $\Pi \subset S^*$ the space of products of elements from the pair $(\Gamma, \Phi)$. Then
well-posedness can be restated in terms of the Fourier transforms: the problem
is well-posed in $S^*$ if for any $\varepsilon_n \to \varepsilon$ where $\varepsilon_n, \varepsilon \in \Pi$ the corresponding
sequence $\gamma_n = \phi^{-1}\varepsilon_n$ converges to $\gamma = \phi^{-1}\varepsilon$.

If $\phi^{-1} \in O_M$ then by hypocontinuity of the product, for any sequence
$\varepsilon_n - \varepsilon$ that converges to zero in $S^*$ the corresponding sequence $\gamma_n - \gamma$ also
converges to zero and well-posedness obtains without any restrictions on $\gamma$.
This result could also extend to infinitely differentiable $\phi$ with some zeros
and applies in the univariate deconvolution to error distributions such as the
uniform and triangular (see Schwartz, pp. 123-125).

Below we provide cases of well-posed deconvolution where less restrictive
differentiability conditions are imposed on $\phi$; this would require additional
conditions on the $\gamma$'s. The nature of the conditions is to ensure that the
product pair is defined for the Fourier transforms; for a continuous $\phi$ this
requires a trade-off between the degree of singularity of $\gamma$ (and correspondingly,
$\varepsilon$) and the differentiability of $\phi$. These trade-offs can occur locally, as in
the example where $\gamma$ is the derivative of a $\delta$-function and $\phi$ is continuously
differentiable at 0, but to streamline the proofs we consider global restrictions
in the product pairs.

First, suppose that both $\phi$ and $\gamma$ are continuous functions; then $\varepsilon$ is
continuous as well. Let $\Gamma$ be a subspace of all continuous functions on $R^d$ such
that all $\gamma \in \Gamma$ belong in a bounded set in $S^*$; then for some $\bar m$ we have
$\Gamma = S^*_{0,\bar m}(V)$ (implying $|((1 + t^2)^{-1})^{\bar m} \gamma| < V < \infty$). Then for any bounded
continuous $\phi$ the product $\varepsilon = \gamma\phi \in S^*_{0,\bar m}(V)$.

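Lemma 2. Suppose that $\gamma, \gamma_n \in S^*_{0,\bar m}(V)$, $\phi$ is a bounded continuous
function; $\varepsilon_n = \gamma_n\phi$, $\varepsilon = \varepsilon_0 = \gamma\phi$, $0 < n < \infty$ and $\phi^{-1}$ is a continuous
function that satisfies (13). Then if $\varepsilon_n \to \varepsilon_0$ as $n \to \infty$ in $S^*$ we get that
$\gamma_n = \phi^{-1}\varepsilon_n$ converges to $\gamma = \phi^{-1}\varepsilon_0$.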

Proof. First note that for any number $\bar\varepsilon > 0$ and any $\psi \in S$ there exists
some bounded set $A \subset R^d$ such that

$|\int_{R^d \setminus A} \varepsilon_n(t)\phi^{-1}(t)\psi(t)dt| < \bar\varepsilon$.

Indeed,

$|\int_{R^d \setminus A} \varepsilon_n(t)\phi^{-1}(t)\psi(t)dt| \le V \sup_{R^d \setminus A} |(1 + t^2)^{\bar m + m}\psi(t)| \int ((1 + t^2)^{-1})^m |\phi^{-1}(t)| dt$.

Since by (13) $\int_{R^d} ((1 + t^2)^{-1})^m |\phi^{-1}(t)| dt$ is bounded, say, by some $V_\phi$, and
for $\psi \in S$ as $t \to \infty$ the value $|(1 + t^2)^{\bar m + m}\psi(t)|$ converges to zero, the set
$A$ can be selected such that

$\sup_{R^d \setminus A} |(1 + t^2)^{\bar m + m}\psi(t)| < \bar\varepsilon V^{-1} V_\phi^{-1}$.

Consider now the value of the functional $(\varepsilon_n\phi^{-1} - \varepsilon\phi^{-1}, \psi)$. Since the
sequence of continuous bounded functions $\varepsilon_n - \varepsilon$ converges to zero in $S^*$ it
converges to zero point-wise and uniformly on bounded $A$. Then for any $\bar\varepsilon$,
with $A$ chosen as above for $\bar\varepsilon/4$, we can find $N$ such that

$\sup_A |\varepsilon_n - \varepsilon| \int_A |\phi^{-1}(t)\psi(t)| dt < \bar\varepsilon/2$

for any $n > N$. It follows that $|(\phi^{-1}\varepsilon_n - \phi^{-1}\varepsilon, \psi)| < \bar\varepsilon$.
Thus $\gamma_n - \gamma \to 0$ in $S^*$. ■

In the measurement error problem the function $\varepsilon$ is a characteristic
function; if all $\varepsilon_n$ in the equation (15) are characteristic functions or at least some
continuous and uniformly bounded functions then they satisfy the conditions
of Lemma 2 with $\bar m = 0$.

The somewhat unusual requirement that the sequence that converges be 
bounded in S* is a requirement that is associated with the weak topology 
of the space of generalized functions. In normed spaces convergence implies 
eventual boundedness in norm of the converging sequence, but this is not the 
case for generalized functions that do not exclude singularities and polyno- 
mial growth. The boundedness requirement uniformly limits the degree of 
singularity and the rate of divergence at infinity of the generalized functions 
in the bounded set. It is however less restrictive than many assumptions in 
normed spaces such as boundedness or integrability. 

The Lemma demonstrates that unlike the behavior in function spaces
where the deconvolution is usually ill-posed, in the weaker topology of the
space $S^*$ deconvolution is well-posed when the error distribution is such that
the condition (13) is satisfied. Usually well-posedness in deconvolution is
examined in terms of ordinary smooth and supersmooth distributions. When
the distribution is ordinary smooth the condition (13) holds; it is not satisfied
for supersmooth distributions.
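
The contrast between ordinary smooth and supersmooth errors can be seen by evaluating a polynomially weighted inverse characteristic function, $((1 + t^2)^{-1})^m |\phi^{-1}(t)|$, on a grid. This is a sketch under assumptions: the Laplace distribution stands in for the ordinary smooth case, the standard normal for the supersmooth case, and the weighted quantity is used as a proxy for the polynomial-growth requirement rather than as the exact statement of condition (13).

```python
import math

def weighted_inverse(phi, t, m):
    """((1 + t^2)^(-1))^m * |1 / phi(t)|: polynomially weighted inverse of phi."""
    return (1.0 + t * t) ** (-m) / abs(phi(t))

laplace_cf = lambda t: 1.0 / (1.0 + t * t)      # ordinary smooth error
gaussian_cf = lambda t: math.exp(-t * t / 2.0)  # supersmooth error

grid = [0.5 * k for k in range(0, 41)]          # t in [0, 20]

# For the Laplace error the weight with m = 1 exactly offsets the growth of 1/phi.
laplace_sup = max(weighted_inverse(laplace_cf, t, m=1) for t in grid)

# For the Gaussian error even a much larger m cannot control the growth of 1/phi.
gaussian_vals = [weighted_inverse(gaussian_cf, t, m=5) for t in grid]

print(laplace_sup, gaussian_vals[-1])
```

No fixed power of $(1 + t^2)^{-1}$ keeps the Gaussian case bounded, which is the feature exploited in the counterexample later in this section.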

An and Hu (2011) demonstrate that well-posedness holds automatically
in a measurement error problem provided there is a mass at zero; this happens
if there is a reporting error but with some non-zero probability some values
are truthfully reported. Indeed if there is a mass at zero the error distribution
is a mixture with the delta-function that has Fourier transform equal to 1;
then $\phi$ is separated from zero and condition (13) for $\phi^{-1}$ holds.
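
A minimal sketch of the mass-at-zero case: if the error equals 0 with probability `p` and is standard normal otherwise, its characteristic function is $p + (1 - p)e^{-t^2/2}$, which never falls below `p`. The normal component and the value `p = 0.2` are assumptions chosen for illustration.

```python
import math

def mixture_cf(t, p):
    """cf of an error that is 0 with probability p and N(0, 1) otherwise."""
    return p + (1.0 - p) * math.exp(-t * t / 2.0)

p = 0.2
grid = [0.1 * k for k in range(-500, 501)]       # t in [-50, 50]

min_cf = min(mixture_cf(t, p) for t in grid)
max_inverse = max(1.0 / mixture_cf(t, p) for t in grid)
print(min_cf, max_inverse)
```

The inverse $\phi^{-1}$ is therefore bounded by $1/p$, so no polynomial weight is needed at all, even though the continuous component is supersmooth.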

A further extension is possible to deconvolution where the generalized
function $\gamma$ is not necessarily a continuous function and can be singular. Then
additional differentiability conditions are needed to ensure that $\gamma$ and $\phi$ are
in a product pair. Let $\Gamma = S^*_{l,m}(V)$ with $l = (l_1, ..., l_d)$; let $\phi$ be a bounded
continuous function that has continuous derivatives of any order up to $l$;
then a product $\varepsilon = \gamma\phi$ exists in $S^*$ for any $\gamma \in \Gamma$.

Theorem 4 Assume $\gamma, \gamma_n \in S^*_{l,m}(V)$, and $\phi$ is a continuous regular function
that has continuous derivatives $\partial^{l_i}$ for all subvectors $l_i$ of $l$; $\varepsilon = \gamma\phi$, $\varepsilon_n = \gamma_n\phi$
for $n = 1, 2, ...$; and $\varepsilon_n - \varepsilon \to 0$ in $S^*$. If $\phi^{-1}$ and all $\partial^{l_i}(\phi^{-1})$ for any subvector
$l_i$ of each $l$ satisfy (13) then $\varepsilon_n\phi^{-1}$ exists in $S^*$ and $Ft^{-1}(\varepsilon_n\phi^{-1}) \to g$ in $S^*$.

Proof. Denote by $c_n$ and $c$ the continuous functions for which $\varepsilon_n$ and $\varepsilon$ can
be defined via the operator $\partial^l$. Consider the functional $(\phi^{-1}\varepsilon_n - \phi^{-1}\varepsilon, \psi) =
(\phi^{-1}\partial^l(c_n - c), \psi)$; it can be extended to the functional

$(c_n - c, (-1)^{|l|}\partial^l(\phi^{-1}\psi)) = (-1)^{|l|} \sum_{l_1 + l_2 = l} (l_1, l_2) (c_n - c, \partial^{l_1}(\phi^{-1})\partial^{l_2}\psi)$,

where $(l_1, l_2)$ is the corresponding integer coefficient arising from
differentiation of the product. All the functionals $(c_n - c, \partial^{l_1}(\phi^{-1})\partial^{l_2}\psi)$ that enter
into the linear combinations on the right-hand side satisfy the conditions of
Lemma 2 and thus converge to zero. Therefore, $\gamma_n - \gamma = (\varepsilon_n - \varepsilon)\phi^{-1} \to 0$ in
$S^*$.

By continuity in $S^*$ of the inverse Fourier transform the limit

$Ft^{-1}(\varepsilon_n\phi^{-1}) \to g$

in $S^*$ follows. ■

If the measurement error model holds conditionally on some $d_c$-dimensional
$x_c$, assume that all the distribution functions are defined on $R^d \times R^{d_c}$ as
$F(\cdot, x_c)$. For any generalized function $b$ on $R^{d_x} \times R^{d_c}$ the partial Fourier
transform $Ft|_{x_c}$ is defined as follows: for $\psi(x, x_c) \in S$ with $x \in R^{d_x}$, $x_c \in R^{d_c}$ define
$Ft|_{x_c}(\psi)(s, x_c) \in S$ by $\int e^{i x^T s}\psi(x, x_c)dx$, then $(Ft|_{x_c} b, \psi) = (b, Ft|_{x_c}(\psi))$.
Consider the following possibilities: 1. the generalized functions $\gamma, \phi, \varepsilon$
denote now the partial Fourier transforms of the distribution functions; 2.
instead of distribution functions consider generalized density functions and
denote by $\gamma, \phi, \varepsilon$ the corresponding partial Fourier transforms; 3. assume
continuous differentiability of the probability distribution functions with respect
to $x_c$ and consider conditional probability or, correspondingly, conditional
generalized densities; additional conditions will be needed to make sure that
dividing by the density of $x_c$ is possible in $S^*$. In all these cases (15) holds.
To avoid imposing extra constraints here examine case 2. For deconvolution
$\phi$ is assumed known and often does not vary with $x_c$. The conditions of
Lemma 2 may not apply as the functions $\varepsilon, \varepsilon_n$ may no longer be continuous;
however, since a probability distribution is a bounded monotone function,
$\varepsilon$ is in some set $S^*_d(V)$ and thus Theorem 4 requires only that all $\varepsilon_n$
belong to $S^*_d(V)$, that (13) is satisfied for $\phi^{-1}$ and that $\phi$ be differentiable in
components of $x$.

The condition (13) for $\phi^{-1}$ is necessary for well-posedness: the example
below shows that even in the weak topology of generalized functions
well-posedness does not hold if it is not satisfied, e.g. for deconvolution with
supersmooth distributions.

Example. Consider the function $\phi(x) = e^{-x^2}$, $x \in R$; then $\phi^{-1}$ does not
satisfy (13). Then there exists a sequence of functions $\gamma_n$ and a $\gamma$ in $S^*$ such
that $\gamma_n\phi \to \gamma\phi$ in $S^*$, but $\gamma_n$ does not converge to $\gamma$ in $S^*$.

Define the function $b_n(x) = e^{x^2} I(n - 2/n, n + 2/n)$, where $I(a, b)$ is the
indicator function of the interval $(a, b)$; then $b_n \in S^*$. Then
define $\gamma_n = \gamma + b_n$ for some fixed $\gamma \in S^*$. Define $\varepsilon_n = \gamma_n\phi$ and $\varepsilon = \gamma\phi$; since
$\phi \in S$, the products are always defined. Consider the difference $\varepsilon_n - \varepsilon$; it
equals $I(n - 2/n, n + 2/n)$. In $S^*$ this sequence converges to zero. On the other
hand, $(\gamma_n - \gamma, \psi)$ for any $\psi \in S$ equals

$\int_{n-2/n}^{n+2/n} e^{x^2}\psi(x)dx$.

Select $\psi \in S$ given by $\psi(x) = \exp(-|x|)$; then $(\gamma_n - \gamma, \psi) =$

$\int_{n-2/n}^{n+2/n} e^{x^2}\psi(x)dx \ge \int_{n-1/n}^{n+1/n} e^{x^2 - x}dx \ge \frac{2}{n} e^{-(n + 1/n) + (n - 1/n)^2}$.

Thus for this $\psi$ the values of the functional $(\gamma_n - \gamma, \psi)$ diverge as $n \to \infty$,
and so $\gamma_n - \gamma$ does not converge to zero in $S^*$. ■

Note that since any $\psi \in D$ has bounded support, in the example
convergence of $\gamma_n$ to $\gamma$ holds in $D^*$.
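
The divergence in the example can be observed numerically. The sketch below assumes that the perturbation is supported on the interval $(n - 2/n, n + 2/n)$ and uses $\psi(x) = \exp(-|x|)$ as in the example; the midpoint quadrature and the function names are assumptions made for illustration.

```python
import math

def midpoint(fn, a, b, steps=4000):
    """Midpoint-rule approximation of the integral of fn over [a, b]."""
    h = (b - a) / steps
    return sum(fn(a + (i + 0.5) * h) for i in range(steps)) * h

psi = lambda x: math.exp(-abs(x))

def pairing_eps(n):
    # (eps_n - eps, psi): the indicator of (n - 2/n, n + 2/n) paired with psi.
    return midpoint(psi, n - 2.0 / n, n + 2.0 / n)

def pairing_gamma(n):
    # (gamma_n - gamma, psi): e^{x^2} on the same interval paired with psi.
    return midpoint(lambda x: math.exp(x * x) * psi(x), n - 2.0 / n, n + 2.0 / n)

ns = [2, 3, 4, 5, 6]
eps_vals = [pairing_eps(n) for n in ns]
gamma_vals = [pairing_gamma(n) for n in ns]

for n, e, g in zip(ns, eps_vals, gamma_vals):
    print(n, e, g)
```

The pairings for $\varepsilon_n - \varepsilon$ shrink while those for $\gamma_n - \gamma$ grow without bound, so a small change in the data side produces an arbitrarily large change in the solution.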

3.3 Well-posedness in S* of the solution to the system of equations (16)

Recall (13) and define a subclass of functions in $S^*$, $\Phi(m, V)$, where $b \in \Phi(m, V)$
if $b$ satisfies the condition

$((1 + t^2)^{-1})^m |b(t)| < V < \infty$. (21)

The requirement that $b$ belong to $\Phi(m, V)$ implies that (13) applies
uniformly in this class and also defines a bounded set in $S^*$. The next Theorem
considers well-posedness of the problem in (3) (and correspondingly, (16))
under the identification conditions of Theorem 3(b). The conditions of
Theorem 3(a) can be similarly considered.

Theorem 5 Suppose that $(\gamma_n, \phi_n)$, $n = 1, 2, ...$ and $(\gamma, \phi)$ belong to a product
pair $(\Gamma, \Phi)$. Additionally, let the conditions of Theorem 3(b) apply to each
pair $(\gamma_n, \phi_n)$. Suppose that $\varepsilon_{1n} - \varepsilon_1 \to 0$ and $\varepsilon_{2kn} - \varepsilon_{2k} \to 0$ in $S^*$; the functions
$\phi, \phi_n$ as well as $\phi^{-1}, \phi_n^{-1}$ restricted to $W$ all belong to some $\Phi(m, V)$. Then if

(a) $(\Gamma, \Phi) = (S^*, O_M)$,

or

(b) all $\varkappa_{kn} = (\phi_n)'_k \phi_n^{-1}$ and $\varkappa_k = (\phi)'_k \phi^{-1}$ are such that $(\varkappa_{kn}, \varepsilon_{1n})$,
$(\varkappa_k, \varepsilon_1)$ belong to some product pair,

the products $\varepsilon_{1n}\phi_n^{-1}$ exist in $S^*$ and $Ft^{-1}(\varepsilon_{1n}\phi_n^{-1}) \to g$ in $S^*$.

Proof. (a) We have that $\phi, \phi_n \in O_M$. It follows that $(\phi)'_k, (\phi_n)'_k \in O_M$. Also
$(\phi)'_k, (\phi_n)'_k \in \Phi(m', V)$, where $m' = m + \iota$, with $\iota$ a vector of ones. From
Theorem 3 it follows that for every $n$ the functions $\gamma_n$ and $\phi_n$ are uniquely
identified on $W$. From now on we consider all functions and function spaces
restricted to $W$, even when $W$ does not coincide with $R^d$, but keep the same
notation. The functions $\varkappa_{kn}$ belong also to $\Phi(\bar m, V)$ where $\bar m = m + m'$. Without
loss of generality assume that each $\varkappa_k$ is also in the same $\Phi(\bar m, V)$, and so all
$\varkappa_{kn}, \varkappa_k$ are in a bounded set in $S^*$. Since from condition (a) it follows that
$\varkappa_{kn} = (\phi_n)'_k \phi_n^{-1} \in O_M$, products are defined and from equations
$\varepsilon_{1n}\varkappa_{kn} - ((\varepsilon_{1n})'_k - i\varepsilon_{2kn}) = 0$ and convergence of $\varepsilon_{1n}$ to $\varepsilon_1$ we get that
$\varepsilon_{1n}\varkappa_{kn} - \varepsilon_1\varkappa_k$ converges to zero in $S^*$. For functions in $O_M$ products with any
elements from $S^*$ exist, thus $\varepsilon_{1n}\varkappa_{kn} - \varepsilon_1\varkappa_{kn}$ exists; moreover $(\varepsilon_{1n} - \varepsilon_1)\varkappa_{kn}$ converges
to zero in $S^*$ by the hypocontinuity property (Schwartz, p. 246). It follows
that $\varepsilon_1(\varkappa_{kn} - \varkappa_k)$ converges to zero in $S^*$. Since $\varepsilon_1$ is supported on $W$ and
$(\varkappa_{kn} - \varkappa_k) \in O_M$, by continuity of the functional $\varepsilon_1$ it follows that $\varkappa_{kn} - \varkappa_k$
converges to zero on $W$. It then follows that $\phi_n - \phi \to 0$ in $S^*$ as well as
pointwise and uniformly on bounded sets in $W$; the product $\phi_n^{-1}\phi^{-1}$ is in a
bounded set in $S^*$, thus $\phi_n^{-1} - \phi^{-1} = \phi_n^{-1}\phi^{-1}(\phi - \phi_n)$ converges to zero in
$S^*$.

Consider $\varepsilon_{1n}\phi_n^{-1} - \varepsilon_1\phi^{-1} = \varepsilon_{1n}(\phi_n^{-1} - \phi^{-1}) + (\varepsilon_{1n} - \varepsilon_1)\phi^{-1}$; this difference
converges to zero in $S^*$, thus $\gamma_n$ converges to $\gamma$ in $S^*$ and since the Fourier
transform is continuously invertible in $S^*$, $g_n = Ft^{-1}(\varepsilon_{1n}\phi_n^{-1}) \to g$ in $S^*$.

(b) Modify the proof of (a) by stating existence of products based on the
assumption in (b) rather than assuming the specific product pair $(S^*, O_M)$. ■

Thus well-posedness obtains for the solution to the errors in variables
regression with Berkson type instruments that provides (3) as long as $\phi^{-1}$ and
all the $\phi_n^{-1}$ are in the same class $\Phi(m, V)$; e.g. a set of the characteristic
functions would satisfy this if they are not supersmooth and the growth of
all $\phi_n^{-1}$ is bounded by the same order polynomial. With the condition (a),
which does not restrict $\gamma \in S^*$, the Theorem will hold if the supports of
$\phi, \phi_n$ are bounded or if these were Fourier transforms of functions with some
singularity points (e.g. from distributions with mass points), when $\phi$ would
include a constant. The condition (b) would require imposing restrictions on
degree of singularity of $\gamma$ and further differentiability conditions on $\phi$ and
$\phi_n$ of the type imposed in Theorem 4 to ensure that all the products exist.
Even if in the model $\phi$ is a characteristic function, $\phi_n$ need not necessarily
be characteristic functions, but just continuous functions that satisfy the
conditions. This result did not require any restrictions on $\gamma$ and thus on the
regression function in the model beyond belonging to $S^*$. Under Theorem
3(a) a similar result holds after imposing the conditions on $\gamma, \gamma_n$ rather than
$\phi, \phi_n$ to ensure that generalized functions $\varkappa_{kn} = (\gamma_n)'_k \gamma_n^{-1}$ belong to $O_M$ and
to some $\Phi(m, V)$.

For example, if as in Cunha et al (2010) both $\gamma$ and $\phi$ are characteristic
functions and thus continuous and bounded, products exist and Theorem
5(b) applies. As long as the conditions of Theorem 3 are satisfied and the
condition (21) is satisfied for the inverses of the functions in the sequence,
that is e.g. under Theorem 3(a) for $\gamma_n^{-1}$, or under 3(b) for $\phi_n^{-1}$, so that the
corresponding distributions are not supersmooth and the polynomial lower
bound holds uniformly, well-posedness holds.

The implications that well-posedness has for estimation are two-fold. One 
is that unless well-posedness holds, that is the inverse mapping from the class 
of the known functions into the class of identified solutions is continuous, the 
solutions corresponding to the consistently estimated known functions will 
not in general provide consistent estimators for the solutions. The other is 
that in a well-posed problem consistent estimation of the known functions 
automatically gives rise to consistency of plug-in estimators of solutions; the 
next section examines consistent estimation of the deconvolution and the
solution to the system (3) in the space $S^*$.
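
A minimal sketch of plug-in estimation in a well-posed deconvolution: the characteristic function of the observed $z = x^* + u$ is estimated by the empirical characteristic function and divided by the known $\phi$. The design is hypothetical: $x^*$ is standard normal, $u$ is Laplace (an ordinary smooth error with $\phi(t) = 1/(1 + t^2)$), the sample size and seed are arbitrary, and the comparison is made pointwise at a few values of $t$ rather than through test functions in $S$.

```python
import cmath
import math
import random

def empirical_cf(sample, t):
    """Empirical characteristic function of the sample at t (convention e^{itz})."""
    return sum(cmath.exp(1j * t * z) for z in sample) / len(sample)

random.seed(0)
n = 20000
x_star = [random.gauss(0.0, 1.0) for _ in range(n)]
# The difference of two independent Exp(1) draws is Laplace with cf 1 / (1 + t^2).
u = [random.expovariate(1.0) - random.expovariate(1.0) for _ in range(n)]
z = [a + b for a, b in zip(x_star, u)]

phi = lambda t: 1.0 / (1.0 + t * t)              # known error characteristic function
t_grid = [0.25, 0.5, 1.0]
gamma_hat = [empirical_cf(z, t) / phi(t) for t in t_grid]    # plug-in estimate of gamma
gamma_true = [math.exp(-t * t / 2.0) for t in t_grid]        # cf of N(0, 1)
max_error = max(abs(gh - gt) for gh, gt in zip(gamma_hat, gamma_true))
print(max_error)
```

Because $\phi^{-1}$ grows only polynomially here, the sampling error in the empirical characteristic function is amplified by a bounded factor at each fixed $t$; with a supersmooth error the same division would magnify the noise much more strongly at large $|t|$.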

4 Random generalized functions and stochastic convergence

This section examines stochastic convergence of the solutions to the
deconvolution equation (2) and to (3), equivalently to (15) and (16). If some
consistent estimators are available for either the function $w$ ($w_1$ and $w_{2k}$)
or, equivalently, for the Fourier transform, $\varepsilon$ ($\varepsilon_1$ and $\varepsilon_{2k}$), stochastic
convergence of the solutions provides consistency results for plug-in estimators of
$g$ as long as well-posedness holds. Since generalized functions in any space
$G^*$ are represented as linear continuous functionals on $G$, results for random
functionals are applicable here; the specific feature is that for any generalized
function $b \in G^*$ one needs to consider the collection of random functionals
indexed by the functions from the space $G$.

4.1 Random generalized functions 

Following Gel'fand and Vilenkin (1964) define random generalized functions
as random linear continuous functionals on the space of test functions (see e.g.
Koralov and Sinai, 2007, who consider specifically $S^*$, ch. 17). In particular,
any random generalized function $b$ on $G$ is represented by a collection of
(complex-valued) random variables on a common probability space that are
indexed by $\psi \in G$, denoted $(b, \psi)$, such that

(a) $(b, (a_1\psi_1 + a_2\psi_2)) = a_1(b, \psi_1) + a_2(b, \psi_2)$ a.s.;

(b) if $\psi_{kn} \to \psi_k$ in $S$ as $n \to \infty$, $k = 1, 2, ..., m$, then vectors

$((b, \psi_{1n}), ..., (b, \psi_{mn})) \to_d ((b, \psi_1), ..., (b, \psi_m))$,

where $\to_d$ denotes convergence in distribution.

As shown in Gel'fand and Vilenkin, equivalently, there exists a probability
measure on $G^*$ such that for any set $\psi_1, ..., \psi_m \in G$ the random vectors
$((b, \psi_1), ..., (b, \psi_m))$ have the same distribution as, for some random functional
$\tilde b$, $((\tilde b, \psi_1), ..., (\tilde b, \psi_m))$. An example is a generalized Gaussian process $b$, so
defined if for any $\psi_1, ..., \psi_m$ the joint distribution of $(b, \psi_1), ..., (b, \psi_m)$ is
Gaussian. A generalized Gaussian process is uniquely determined by its mean
functional, $\mu_b$: $\mu_b(\psi) = E(b, \psi)$, and the covariance bilinear functional,

$B_b(\psi_i, \psi_j) = E((b, \psi_i)(b, \psi_j))$.

Gel'fand and Vilenkin (v. 4, p. 260) give the covariance functional of the
generalized derivative, $W'$, of the Wiener process as

$B_{W'}(\psi_1, \psi_2) = \int_0^\infty \psi_1(t)\overline{\psi_2(t)}dt$,

where the overbar represents complex conjugation; for real-valued processes
it is not needed. The Fourier transform of a Gaussian random process $b$ is
also a Gaussian random process with covariance functional

$B_{Ft(b)}(\psi_i, \psi_j) = E((b, Ft(\psi_i))(b, Ft(\psi_j)))$.

So for $Ft(W')$ the covariance functional is

$B_{Ft(W')}(\psi_1, \psi_2) = \int_0^\infty Ft(\psi_1)(t)\overline{Ft(\psi_2)(t)}dt$.

The mean functional is zero for $W'$ and $Ft(W')$.

Gel'fand and Vilenkin (1964) provide definitions and results for generalized
random functions in $G^* = D^*$, rather than $S^*$. One can similarly define
random generalized functions on other spaces of test functions, not necessarily
infinitely differentiable, e.g. on the space $D_k$ of $k$ times continuously differentiable
functions with compact support, leading to the space $D_k^*$ of random linear
continuous functionals on $D_k$.

4.2 Stochastic convergence of random generalized functions

A random sequence $b_n$ of elements of a space of generalized functions $G^*$
converges to zero in probability: $b_n \to_p 0$ in $G^*$, or almost surely: $b_n \to_{a.s.} 0$ in $G^*$,
if for any finite set $\psi_1, ..., \psi_v \in G$ the random vectors $((b_n, \psi_1), ..., (b_n, \psi_v)) \to_p
(0, ..., 0)$ or correspondingly, $Pr(((b_n, \psi_1), ..., (b_n, \psi_v)) \to (0, ..., 0)) = 1$.

Similarly, convergence in distribution of generalized random processes
$b_n \Rightarrow_d b$ is defined by the convergence of all multivariate distributions for
random vectors $((b_n, \psi_1), ..., (b_n, \psi_v)) \to_d ((b, \psi_1), ..., (b, \psi_v))$ for any set
$\psi_1, ..., \psi_v \in G$.

Remark 1. (a) If $b_n - b \to_p 0$ in $S^*$ then $Ft(b_n) - Ft(b) \to_p 0$ in $S^*$
and $Ft^{-1}(b_n) - Ft^{-1}(b) \to_p 0$ in $S^*$. Indeed, for any set $\psi_1, \ldots, \psi_v \in S$

$$((Ft(b_n) - Ft(b), \psi_1), \ldots, (Ft(b_n) - Ft(b), \psi_v)) = ((b_n - b, Ft(\psi_1)), \ldots, (b_n - b, Ft(\psi_v)))$$

and since the set $Ft(\psi_1), \ldots, Ft(\psi_v) \in S$, then
$((b_n - b, Ft(\psi_1)), \ldots, (b_n - b, Ft(\psi_v))) \to_p 0$. Similarly for $Ft^{-1}$.

(b) If $\mu \in G$ and $b_n - b \to_p 0$ in $G^*$, then $b_n\mu - b\mu \to_p 0$ in $G^*$. This
follows similarly from the fact that $\mu\psi \in G$ for $\psi \in G$.

(c) Parts (a) and (b) of this Remark also hold with $\to_{a.s.}$ replacing $\to_p$
and with convergence to zero replaced by convergence in distribution to a limit
generalized random process.

5 Consistent estimation of solutions to stochastic equations

Suppose that the known functions, $w$ or $w_1, w_{2k}$ (equivalently, $\varepsilon$ or $\varepsilon_1, \varepsilon_{2k}$),
are consistently estimated in $S^*$. In the models discussed here these
functions are density functions or conditional mean functions; typically these are
ordinary functions for which commonly used nonparametric estimators are
shown to be consistent pointwise or in some norm (uniform or $L_1$, say) under
some sufficient conditions. If the functions and the estimators can be
represented as elements in $S^*$ then convergence in many common norms implies
convergence in the weaker topology of $S^*$, so that such consistent estimators
are consistent in $S^*$ and the following discussion of consistency of the
solutions to convolution equations considered here applies. However, consistency
in $S^*$ applies more widely than the usual consistency results. For example,
as shown in Zinde-Walsh (2008), kernel estimators of density converge as
stochastic generalized functions even when they diverge pointwise (e.g. at
mass points, or for fractal measures).

For the equations considered here, when identification and well-posedness
hold, consistent estimation of the known functions can provide consistent
plug-in estimators for the solutions, as long as the estimators all lie in the
classes over which well-posedness obtains.

The fact that well-posedness implies consistency of some plug-in estimators
holds generally. Denote (generically) the known functions by $w_1, \ldots, w_M$
and their estimators, consistent in the topology of $S^*$, by $\hat w_i$; denote the
solutions by $g_1, \ldots, g_K$ and the plug-in estimators obtained from solving the
equations using $\hat w_i$ by $\hat g_j$. If the estimators $\hat w_i$ are in classes where
well-posedness holds, then for any set $(\psi_1, \ldots, \psi_v)$ from $S$ and any neighborhood
$N_{vK}(0)$ of zero in real or complex $vK$-dimensional Euclidean space there is a
neighborhood of zero, $N_{vM}(0)$, in the corresponding $vM$-dimensional space such
that the event

$$E_w = \{((\hat w_1 - w_1, \psi_1), \ldots, (\hat w_1 - w_1, \psi_v), (\hat w_2 - w_2, \psi_1), \ldots, (\hat w_M - w_M, \psi_v))' \in N_{vM}(0)\}$$

by well-posedness implies the event

$$E_g = \{((\hat g_1 - g_1, \psi_1), \ldots, (\hat g_1 - g_1, \psi_v), (\hat g_2 - g_2, \psi_1), \ldots, (\hat g_K - g_K, \psi_v))' \in N_{vK}(0)\}.$$

Consistency of $\hat w_i$ as $n \to \infty$ means that for large enough $n$ the probability of $E_w$
can be arbitrarily close to 1; thus the probability of $E_g$ is as close or closer to 1.
Thus the condition for consistent plug-in estimation is that, with probability
approaching 1, the estimators are in the classes of generalized functions that
provide well-posedness of solutions.

5.1 Consistency of deconvolution: solving (15)

The well-posedness condition directly applies to the measurement error
deconvolution problem where $\phi$ is known and the conditions apply only to
estimators of observables. The requirement of boundedness of the sequence
of the estimators in the space $S^*$ would follow if the norms of the functions
were bounded in probability; $L_p$ are the spaces typically used in the
literature (e.g. Fan, 1991; Carrasco and Florens, 2010). But here, more generally,
consistency of plug-in deconvolution holds for any probability distribution;
the only important restriction is for $\phi^{-1}$ to satisfy (13), that is, for the
measurement error not to be supersmooth.

Moreover, the functions in (2) and the corresponding (15) need not be
generalized densities and characteristic functions; the conditions for
consistent estimation follow from those for well-posedness and are essentially that
the two functions belong to a convolution pair (equivalently, their Fourier
transforms to a product pair) and that the known function be continuous and $\phi^{-1}$
satisfy (13).

For example, consider in the univariate case a generalized density function,
$w$. One can use empirical distribution functions to estimate the
corresponding distribution function; then the estimator of the generalized density by
generalized derivatives provides as an estimator the sum of delta-functions:
$w_n = \frac{1}{n}\sum_{j=1}^n \delta(x_j)$, where $(\delta(x_j), \psi) = \psi(x_j)$. Then the corresponding Fourier
transform $\varepsilon_n = Ft(w_n)$ is given by a continuous function $\varepsilon_n(s) = \frac{1}{n}\sum_{j=1}^n e^{isx_j}$.
Deconvolution using the deconvolution kernel (as e.g. in An and Hu, 2012)
is applied when the density of the mismeasured variable is assumed to exist
and belong to some $L_p$; the deconvolution kernel incorporates spectral cut-off
by employing estimators of the form $\tilde\varepsilon_n = \varepsilon_n(s) I(|s| < T_n)$, where $T_n$ goes to
$\infty$ as $n \to \infty$ at some rate; the indicator function allows for smoothing in
the inverse Fourier transform.
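
As a minimal numerical sketch of the two estimators just described, the following Python code evaluates the empirical characteristic function on a grid, optionally applies the spectral cut-off $I(|s| < T)$, and divides by a known error characteristic function. The function names, the grid, and the choice of a centred Laplace error (whose characteristic function $1/(1+b^2s^2)$ has a polynomially growing reciprocal, so that the error is not supersmooth) are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def empirical_cf(sample, s):
    """e_n(s) = (1/n) * sum_j exp(i * s * x_j), evaluated at each point of s."""
    sample = np.asarray(sample, dtype=float)
    s = np.asarray(s, dtype=float)
    return np.exp(1j * np.outer(s, sample)).mean(axis=1)

def cutoff_cf(sample, s, T):
    """Spectral cut-off version: e_n(s) * I(|s| < T)."""
    s = np.asarray(s, dtype=float)
    return empirical_cf(sample, s) * (np.abs(s) < T)

def laplace_cf(s, b):
    """Characteristic function of a centred Laplace(b) error (illustrative choice)."""
    return 1.0 / (1.0 + (b * np.asarray(s, dtype=float)) ** 2)

def deconvolved_ft(sample, s, b, T=None):
    """Plug-in estimate of gamma = eps / phi on the grid s; T=None means no cut-off."""
    e_n = empirical_cf(sample, s) if T is None else cutoff_cf(sample, s, T)
    return e_n / laplace_cf(s, b)
```

For a sample consisting of the single point 0 and `b = 1`, `deconvolved_ft` returns `1 + s**2` on the grid; with `T = 1.5` the value at `s = 2` is set to zero by the cut-off.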

By the results here consistency in $S^*$ is ensured for both estimators.
Indeed, since the empirical distribution function converges uniformly in probability
to the distribution function, it converges in $S^*$; then the generalized derivatives
also converge in $S^*$, and by continuity of the Fourier transform the estimators $\varepsilon_n$,
and for $T_n \to \infty$ also the cut-off estimators $\tilde\varepsilon_n = \varepsilon_n(s) I(|s| < T_n)$, converge to
$\varepsilon$ in probability. All these functions are continuous and bounded, and only (13)
for $\phi^{-1}$ is needed for consistency of the deconvolution solution in $S^*$; the
spectral cut-off is not required. The following remark summarizes this result.

Remark 2. If the functions in equation (2) are all generalized density
functions, the known characteristic function $\phi$ is such that $\phi^{-1}$ satisfies (13),
and $\{x_j\}_{j=1}^n$ is a random sample from the distribution with the generalized density
$w$, then the deconvolution estimator

$$\hat g_n = Ft^{-1}\left(\phi^{-1}(s)\,\frac{1}{n}\sum_{j=1}^n e^{isx_j}\right)$$

is consistent in $S^*$: for any finite set $\psi_1, \ldots, \psi_v \in S$ the random vector
$((\hat g_n - g, \psi_1), \ldots, (\hat g_n - g, \psi_v))^T \to_p 0$.

Regularization with suitable spectral cut-off in the case of supersmooth
error distributions typically provides a sequence of estimators that will converge
in norm, albeit slowly for supersmooth distributions (Fan, 1991); convergence
to the limit in the normed space implies convergence in the topology of $S^*$.
However, if $\phi$ is not supersmooth the class of generalized functions where
consistent estimation holds does not include all probability distributions.
Indeed, spectral cut-off estimation provides generalized densities that are a limit
of inverse Fourier transforms of truncated generalized functions. Schwartz
(1964, pp. 271-273) gives a characterization of any function in $S^*$ with Fourier
transform that has compact support (in a cube $|z_k| < C$, $k = 1, \ldots, d$) based
on the Wiener-Paley theorem. Such a function is a continuous function $g$ that
can be extended to an entire analytic function $G$ of a complex argument and
is of exponential type $\le 2C$, meaning

$$\limsup_{|z| \to \infty} \frac{\ln |G(z)|}{|z_1| + \ldots + |z_d|} \le 2C.$$

Thus as long as $g$ is such a function or a limit of such functions it can
be expressed via the regularized solution. As in Schwartz, the subspace of
all functions of exponential type (for any finite $C$) can also be considered.
However, if $g$ does not belong to this subspace, regularized solutions may not
converge to it even in $S^*$.

5.2 Consistency of estimated solutions to the system of equations (2) and (16)

Establishing consistency for the system of equations is more complicated,
since consistency requires not only conditions on estimators of known
functions of the observables, $\varepsilon_\cdot$, but also verification that the resulting estimators
of $\phi$ satisfy the conditions of well-posedness in Theorem 5.

The next Theorem gives conditions for a consistent plug-in nonparametric
estimator for the model (7)-(9) that leads to the system and thus (16)
with continuously differentiable $\phi$.

Semiparametric generalized method of moments estimation of this model
for a class of regression functions that includes functions in $L_1(R^d)$ was
proposed by Wang and Hsiao (2010); semiparametric estimation was also
discussed for somewhat different classes of univariate parametric regression
functions in Schennach (2007) and Zinde-Walsh (2009). Semiparametric
estimation of polynomial regression is in Hausman, Newey, Ichimura and Powell
(1991).

Start by formally stating the assumptions.

Assumption 4 (model). In the model (7)-(9) in $R^d$ the moments of
the errors satisfy $E(u_y \mid z, u, u_x) = 0$ and $E(u_x \mid z, u) = 0$; $z$ is independent of $u$;
$g$ is a generalized function in $S^*$.

This assumption implies that if the function $g$ is a regular locally summable
function it cannot grow at infinity at a rate faster than some polynomial rate.
This assumption does not exclude the possibility that $g$ has some singularities;
for example, it could be a sum of $\delta$-functions or a mix of a regular
function and some peaks represented by $\delta$-functions.

Denote by $f_z$ the density of $z$. Recall that $f$ denoted the generalized
density of $u$ and $\gamma = Ft(g)$; $\phi = Ft(f)$, $\varepsilon_\cdot = Ft(w_\cdot)$.

Assumption 5 (support).

(a) $\mathrm{supp}(\gamma)$ is a connected set that includes 0 as an interior point;

(b) $\mathrm{supp}(\phi) = R^d$;

(c) $\mathrm{supp}(f_z) = R^d$.

Assumption 5(a) would not be satisfied by a polynomial regression function,
when the support of $\gamma$ consists of one point, 0; semiparametric estimation
for that case was provided in Hausman et al. (1991). Support assumptions
(b) and (c) are standard; they could be relaxed.

Recall that $\phi$ here is a characteristic function; $\phi(0) = 1$.

Assumption 6 (generalized functions).

(a) The function $f_z$ is continuous and belongs to $O_M$.

(b) The continuous functions $w_\cdot$ belong to a bounded set in $S^*$, $S^*_{0,m}(V)$;
also, $\phi'_k$ for $k = 1, \ldots, d$, as well as $\phi^{-1}$ and $f_z^{-1}$, all belong to $S^*_{0,m}(V)$.

(c) The regression function $g$ is absolutely integrable.

(c') The regression function $g$ can be represented as a sum $g = g_{L_1} + g_g$,
where $g_{L_1}$ is absolutely integrable and $g_g$ is such that the Fourier transform
of $g_g$ in $S^*$, $\gamma_g = Ft(g_g)$, is singular and thus has support set $A_g$ that is
compact and of zero Lebesgue measure; $A_g$ is a proper subset of $\mathrm{supp}(\gamma)$.
The generalized function $\gamma_g$ is such that there exists a deterministic sequence
of regular functions, $(\gamma_g)_n$, that converges to $\gamma_g$ in $S^*$ and such that the support
of $(\gamma_g)_n$ is in a compact set $A_{gn}$; there exists $\zeta_0 > 0$ such that $|(\gamma_g)_n| > 2\zeta_0$
and, for $\gamma_{L_1} = Ft(g_{L_1})$, on $A_g$: $|\gamma_{L_1} - (\gamma_g)_n| > \zeta_0$.

Assumptions 6(a) and 6(b) imply that $w_\cdot$ can be divided by $f_z$ and any
generalized function can be divided by $\phi$. Requiring that $f_z$ be in $O_M$ is
sufficient to ensure that the model leads to equations (3) in $S^*$. More detailed
conditions similar to those employed in Theorem 2 would allow relaxing the
infinite differentiability assumption 6(a). In particular, if $\gamma$ is a characteristic
function, Assumption 6(a) for $f_z$ is not needed.

Continuity of $w_\cdot$ in 6(b) would follow by properties of convolution if either
$g$ were continuously differentiable, or $f$ were continuous.

In (b) using the same bound on growth, $m$, and the same $V$ for all the
functions simplifies exposition without loss of generality. The bounds could
be liberal but are assumed known in the construction of estimators. The
constraint on $\phi^{-1}$ restricts the measurement error from being supersmooth,
and the constraint on $f_z^{-1}$ does not permit fast decline to zero at infinity
for the density of the conditioning variable $z$; these would be automatically
satisfied if supports were bounded.

Assumption (c) implies that $\gamma$ and therefore $\varepsilon_1$ are continuous functions.
Indeed, an integrable function has a continuous Fourier transform, $\gamma$, and
$\varepsilon_1 = \gamma\phi$ is continuous since $\phi$ is a characteristic function and thus continuous.
Assumption 6(c') holds more generally, e.g. if $g$ is a sum of an integrable
function and a polynomial or a sin or cos function; in such cases $\gamma$ is a sum of
a continuous function (that is separated from zero on bounded sets within its
support) with singular functions such as the $\delta$-function, shifted $\delta$-function
and its derivatives. The support conditions in the assumption imply that
the open support set of the continuous $\gamma_{L_1}$ contains $A_g$. The existence of a
function sequence converging in $S^*$ to $\gamma_g$ with the support properties stated
follows from the general properties of generalized functions; the only
substantive condition there is that the values of the approximating functions
$(\gamma_g)_n$ be separated from zero and from the values of $\gamma_{L_1}$. For example, if $g$
included an additive constant, then $\gamma_g$ is a $\delta$-function, and $(\gamma_g)_n$ could be
selected as a sequence of step functions. These approximating functions need
not be specified; only their existence is required for the proof.

The next assumption is on the stochastic properties of the data generating
process, the sampling, and on the kernel and bandwidth.

Assumption 7. Moments of order $q > d + 1$ of $\frac{1}{(1+z^2)^m} y$, $\frac{1}{(1+z^2)^m} xy$
conditional on $z$ are bounded; $\{x_i, y_i, z_i\}_{i=1}^n$ is a random sample from $\{x, y, z\}$;
the kernel $K$ is the indicator function of the unit sphere; the bandwidth $h$ is
such that $h \to 0$ and satisfies $n^{q-1}h \to \infty$.

Note that Assumption 6(b) implies boundedness of first conditional
moments (functions $w_\cdot$) of $\frac{1}{(1+z^2)^m} y$, $\frac{1}{(1+z^2)^m} xy$.

Define the estimators (denoting by $(x_k)_i$ the $i$th observation on the $k$th
component of vector $x$)

$$\hat w_1(z)_n = \frac{\sum_{i=1}^n y_i K\left(\frac{z_i - z}{h}\right)}{\sum_{i=1}^n K\left(\frac{z_i - z}{h}\right)}, \qquad \hat w_{2k}(z)_n = \frac{\sum_{i=1}^n (x_k)_i\, y_i K\left(\frac{z_i - z}{h}\right)}{\sum_{i=1}^n K\left(\frac{z_i - z}{h}\right)}, \quad k = 1, \ldots, d. \tag{22}$$

By Assumption 6(b) the functions $\frac{1}{(1+z^2)^m} w_1(z)$, $\frac{1}{(1+z^2)^m} w_{2k}(z)$ are bounded
in absolute value by $V$; by 6(a) the density of $z$, $f_z(z)$, exists and is continuous,
so that for $z$ in any closed sphere $S(x, r) \subset \mathrm{supp}(f_z)$ the $\mathrm{ess\,inf}\, f_z(z) > 0$.
Together with Assumption 7 this is sufficient to ensure that the estimators
$\frac{1}{(1+z^2)^m} \hat w_1(z)_n$, $\frac{1}{(1+z^2)^m} \hat w_{2k}(z)_n$, with $\hat w_\cdot(z)_n$
computed as in (22) with $K$ and $h$ that satisfy Assumption 7, converge in
probability uniformly over any compact set (Devroye, 1978). If $m = 0$ the moment
condition is a usual condition made for the Nadaraya-Watson estimator of
$w_\cdot$; here essentially just the growth of the conditional moment functions has
to be restricted; this provides estimators that converge in the topology of $S^*$.
The bound $V$ is assumed known here; more restrictive assumptions, including
in particular differentiability of $w_\cdot$, could provide a uniform over compact sets
rate of convergence for e.g. asymptotically optimal estimators (e.g. Stone,
1982); then the bound $V$ used in constructing the estimators could grow with sample size.
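
The estimators above are Nadaraya-Watson ratios with an indicator kernel, which can be sketched in a few lines of Python. The function names and the convention of returning NaN where no observation falls within distance $h$ are assumptions for illustration, not part of the paper; the kernel is taken to be the indicator of the closed unit ball.

```python
import numpy as np

def _as_points(a):
    """Treat a 1-D array as n points in R^1; leave (n, d) arrays unchanged."""
    a = np.asarray(a, dtype=float)
    return a.reshape(-1, 1) if a.ndim == 1 else a

def nw_indicator(z_eval, z, y, h):
    """Nadaraya-Watson estimate of E(y | z) with K = indicator of the unit ball."""
    z_eval, z = _as_points(z_eval), _as_points(z)
    y = np.asarray(y, dtype=float)
    # K((z_i - z) / h) equals 1 when ||z_i - z|| <= h and 0 otherwise
    dist = np.linalg.norm(z_eval[:, None, :] - z[None, :, :], axis=2)
    k = (dist <= h).astype(float)
    denom = k.sum(axis=1)
    num = k @ y
    out = np.full(len(z_eval), np.nan)  # stays NaN where the window is empty
    np.divide(num, denom, out=out, where=denom > 0)
    return out
```

Under these assumptions the estimator of $w_1$ is `nw_indicator(z_eval, z, y, h)`, and the estimator of $w_{2k}$ is obtained by passing `x[:, k] * y` in place of `y`.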

Uniform convergence in probability implies that the estimators converge
in probability in the topology of $D^*$. If the functions have compact support
this implies convergence in $S^*$, since on compact support it coincides with
convergence in $D^*$, but on unbounded support growth at infinity needs to
be controlled for convergence in $S^*$. Thus, for any generic estimator $\hat w_n$,
represented by a regular function $\hat w(z)_n$, of a regular generalized function
$w = w(z)$ in $S^*_{0,m}(V)$, define a corresponding estimator $\tilde w_n = \tilde w(z)_n$ by
setting it to $\hat w(z)_n$ if $|\hat w(z)_n| < V(1 + z^2)^m$, and to $V(1 + z^2)^m$ otherwise.
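
The truncation rule can be written directly as code. In this sketch the function name is hypothetical, and $z^2$ is interpreted as the squared Euclidean norm of $z$, which is an assumption about the notation.

```python
import numpy as np

def truncate_to_growth_bound(w_vals, z, V, m):
    """Keep w where |w| < V * (1 + |z|^2)^m; otherwise replace it by that bound."""
    z = np.asarray(z, dtype=float)
    z = z.reshape(-1, 1) if z.ndim == 1 else z      # n points in R^d
    bound = V * (1.0 + (z ** 2).sum(axis=1)) ** m
    w_vals = np.asarray(w_vals, dtype=float)
    return np.where(np.abs(w_vals) < bound, w_vals, bound)
```

Note that, following the rule as stated, a value that violates the bound is replaced by the positive bound $V(1+z^2)^m$ regardless of its sign.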

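Lemma 3. Suppose that the estimator $\hat w_n$ converges to $w$ in probability
uniformly on bounded sets; then the truncated estimator $\tilde w_n$ defined above
converges to $w$ in probability in $S^*$ and the corresponding Fourier transform
$\tilde\varepsilon_n = Ft(\tilde w_n)$ converges in probability in $S^*$ to $\varepsilon = Ft(w)$.

Proof. Consider a set $\psi_1, \ldots, \psi_v \in S$. For any $\zeta > 0$ find a compact set $A$
such that for $\psi$ representing any of $\psi_1, \ldots, \psi_v$, $Ft(\psi_1), \ldots, Ft(\psi_v)$,

$$\int_{R^d \setminus A} V(1 + t^2)^m |\psi(t)|\,dt < \zeta.$$

Consider $|\tilde w_n - w| \le |\hat w_n - w| I(z \in A) + |\tilde w_n - \hat w_n| I(z \in A) + |\tilde w_n - w| I(z \in R^d \setminus A)$.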
Then for any C tl there exists N such that Pr ^ (w n — w, ifj > (-^j < 

Pr ^sup \w n - w\ | J 4){t)dt\ > C^j +Pr A: \w n — w n \ \ J 4>{t)dt\ > ( 1 - 2(^j < 



since the contribution to $(\tilde w_n - w, \psi)$ from outside $A$ is bounded with certainty by $2\zeta$ and
the two probabilities on bounded $A$ can be bounded because of uniform
convergence in probability of the estimators. Recall that for Fourier transforms
$(Ft(\tilde w_n - w), \psi) = (\tilde w_n - w, Ft(\psi))$. Thus convergence in probability in $S^*$
of the estimators of the functions $w$, and also of the Fourier transforms $\varepsilon$, is
established. ■

Denote $Ft(w_{\cdot n})$ by $\varepsilon_{\cdot n}$; note that by construction these random
generalized functions are infinitely differentiable, since by the assumption on the
kernel $K$ the support of $w_{\cdot n}$ is bounded and thus the Fourier transform is
differentiable. The Theorem below establishes consistency of plug-in estimators.

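Theorem 6. If Assumptions 4-7 are satisfied, then

(i) if $\mathrm{supp}(\gamma)$ is bounded, the plug-in estimator

$$\gamma_n = \varepsilon_{1n}\,\phi_n^{-1}, \qquad \phi_n^{-1}(s) = \exp\left(-\int_0^s \sum_{k=1}^d \varepsilon_{1n}^{-1}(t)\left((\varepsilon_{1n}(t))'_k - i\varepsilon_{2kn}(t)\right)dt_k\right),$$

is such that it exists with probability going to 1 and $Ft^{-1}(\gamma_n) - g \to_p 0$ in $S^*$;

(ii) generally, the estimator $\tilde\gamma_n = \varepsilon_{1n}\,\tilde\phi_n^{-1}$ with

$$\tilde\phi_n^{-1}(s) = \exp\left(-\int_0^s \sum_{k=1}^d \varepsilon_{1n}^{-1}(t)\left((\varepsilon_{1n}(t))'_k - i\varepsilon_{2kn}(t)\right)dt_k\right)$$

if

$$\left|\exp\left(-\int_0^s \sum_{k=1}^d \varepsilon_{1n}^{-1}(t)\left((\varepsilon_{1n}(t))'_k - i\varepsilon_{2kn}(t)\right)dt_k\right)\right| < V(1 + s^2)^m,$$

and $\tilde\phi_n^{-1}(s) = V(1 + s^2)^m$ otherwise, is such that it exists with probability
approaching 1 and $Ft^{-1}(\tilde\gamma_n) - g \to_p 0$ in $S^*$.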
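
To make the structure of the plug-in formula concrete, here is a univariate ($d = 1$) numerical sketch in Python. It takes grid values of $\varepsilon_1$, its derivative and $\varepsilon_2$ as given inputs (deterministic arrays standing in for the estimates), forms $\kappa = \varepsilon_1^{-1}(\varepsilon_1' - i\varepsilon_2)$, and integrates from 0 with the trapezoid rule. The function names, the grid, and the quadrature rule are assumptions for illustration; the truncation in part (ii) is not included.

```python
import numpy as np

def integral_from_zero(values, s):
    """Cumulative trapezoid integral over the grid s, shifted to vanish at the grid point nearest 0."""
    values = np.asarray(values)
    s = np.asarray(s, dtype=float)
    steps = 0.5 * (values[1:] + values[:-1]) * np.diff(s)
    cum = np.concatenate(([0.0], np.cumsum(steps)))
    return cum - cum[np.argmin(np.abs(s))]

def phi_inverse(s, eps1, d_eps1, eps2):
    """phi^{-1}(s) = exp(-int_0^s kappa(t) dt), with kappa = (eps1' - i * eps2) / eps1."""
    kappa = (np.asarray(d_eps1) - 1j * np.asarray(eps2)) / np.asarray(eps1)
    return np.exp(-integral_from_zero(kappa, s))

def gamma_plug_in(s, eps1, d_eps1, eps2):
    """gamma = eps1 * phi^{-1}, the Fourier transform of the regression function."""
    return np.asarray(eps1) * phi_inverse(s, eps1, d_eps1, eps2)
```

As a check under these assumptions, take $\gamma(s) = e^{-s^2/2}$ and $\phi(s) = 1/(1+s^2)$, and build the inputs so that $\varepsilon_1 = \gamma\phi$ and $\varepsilon_1\kappa = \varepsilon_1' - i\varepsilon_2$ holds with $\kappa = \phi'/\phi$ (that is, $\varepsilon_2 = -i\gamma'\phi$). On a fine symmetric grid the sketch then returns approximately $1 + s^2$ for $\phi^{-1}$ and recovers $\gamma$.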

Proof. The proof consists of the following steps.

Step 1 considers a compact set inside the support of $\gamma$ and shows that on
such a set consistency in $S^*$ follows.

Step 2 examines the continuous part of $\gamma$, $\gamma_{L_1}$, and convergence in
probability of estimators defined on compact sets that exclude a set containing
the support of the singular part.

Combining the results of Steps 1 and 2 concludes the proof of (i).

Step 3 considers the case of unbounded support with $\tilde\phi_n$ defined in (ii).
On any compact set with this estimator replacing $\phi_n$ the results in Steps
1 and 2 hold, so consistency on a bounded set obtains. Consistency on the
unbounded support is shown in the topology of $S^*$ by selecting, for any set
$\psi_1, \ldots, \psi_v$ from $S$, the corresponding compact set defined in Lemma 3 to bound
all the functionals in probability outside of the compact set; this proves (ii).

Next, the details are provided.

Step 1. First, consider $\varepsilon_{1n}$; this is a sequence of continuous functions that converge
in probability in $S^*$ by Lemma 3. Fourier transforms of functions in $L_1$ are
uniformly continuous on compact sets. If Assumption 6(c) holds, convergence
is to the continuous function $\varepsilon_1$. Then $\varepsilon_{1n}$ converge in probability pointwise
and uniformly on the compact set $A$. Then for any $0 < \zeta_1, \zeta_2 < 1$ we can find
$N_1 = N(A, \zeta_1, \zeta_2)$ such that for $n > N_1$

$$\Pr(\sup_A |\varepsilon_{1n} - \varepsilon_1| > \zeta_1) < \zeta_2.$$

Consider $A$ that satisfies $\inf_A |\varepsilon_1| > 0$. Set $\zeta_1 < \inf_A |\varepsilon_1|$; then

$$\Pr(\inf_A |\varepsilon_{1n}| < \zeta_1) < \zeta_2; \quad \text{and} \quad \Pr(\sup_A |\varepsilon_{1n}^{-1}| > \zeta_1^{-1}) < \zeta_2. \tag{23}$$



Under 6(c') for a singular $\gamma_g$ there exists a deterministic sequence of
regular functions, $(\gamma_g)_n$, that converges to $\gamma_g$ in $S^*$ and such that the support of
$(\gamma_g)_n$ is in a compact set $A_{gn}$ that is a proper subset of the support of $\gamma$; it can
be selected such that for some sequence $\zeta_{gn} \to 0$ the Lebesgue measure of
$A_{gn}$ satisfies $\lambda(A_{gn}) < \zeta_{gn}$; $A_{gn_1} \subset A_{gn_2}$ for $n_1 > n_2$, and on any $A_{gn}$ for some fixed $\zeta_0$
we get $\inf_{A_{gn}} |(\gamma_g)_n| > \zeta_0$. For the corresponding sequence $(\varepsilon_g)_n = (\gamma_g)_n \phi$ on
$A_{gn}$ positive lower bounds on the modulus exist and are no less than $\zeta_0 \inf_{A_{gn}} |\phi|$.

Under 6(c') consider the sequence of deterministic functions (e g ) = 
(lg) n 4> and the difference (sli) h = £ i~{ £ g) n ■ This is a sequence of piece-wise 
deterministic continuous functions that converges in S* to the continuous 



function e±Li = 7li0- Then on a compact set A for Ci < min inf £lli , Co m f 

\ A Ag„ 

we can find the corresponding so that for e ln = e ln — (e g ) + (s g ) n 



Pr(sup \e ln - (e g ) n - e 1L1 \ < Ci) < C 2 - 

A 
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Then 



Pr(inf \e ln \ < Ci) < Pr(mf \e X n ~ (e 9 )J < Ci) < C 2 ! 

and (I2"5j) also holds. 

Then with probability approaching 1 on compact $A$ the continuous random
functions $\kappa_{kn} = \varepsilon_{1n}^{-1}((\varepsilon_{1n})'_k - i\varepsilon_{2kn})$, $k = 1, \ldots, d$, are in a bounded set in
$S^*$, and thus $\phi_n^{-1}(s) = \exp(-\int_0^s \sum_{k=1}^d \kappa_{kn}(t)\,dt_k)$ is also in a bounded set, and
thus with probability approaching 1 satisfies the condition for well-posedness
of Theorem 5. Then the estimators $\gamma_n = \phi_n^{-1}\varepsilon_{1n}$ are consistent for $\gamma$ in $S^*$ on
the compact set $A$ where $\inf_A |\varepsilon_{1L_1}| > 0$.

Since $\gamma_{L_1}$ is a continuous function its support is an open set. By
Assumption 6(c') the compact support of $\gamma_g$ is contained inside the open support of
$\gamma_{L_1}$. If $\inf_{\mathrm{supp}(\gamma_{L_1})} |\varepsilon_{1L_1}| > \zeta_1 > 0$ this concludes the proof of (i).

Step 2. Otherwise consider the set $A = \cup A_{gn}$ and the open set $\Omega = \mathrm{supp}(\gamma) \setminus A$;
then $\varepsilon_{1n}$ converge to $\varepsilon_1$ that is continuous on that set. Of course under
Assumption 6(c) $\Omega = \mathrm{supp}(\gamma)$. Generally $\Omega$ is a union of open connected sets
and we can proceed by considering each component in $\Omega$. Without loss of
generality such a component can be assumed to be a connected open set
containing zero as an interior point, since if it does not contain zero this can be
attained by a shift (which is a continuous operation in $S^*$) of an arbitrary
interior point into zero.

All the proofs that follow apply to an open set that will be denoted $\Omega$
that is connected and contains zero as an interior point.

Consider a compact set A C O where infled > 0. It follows from the 

A 

proof in step 1 under continuity of S\ that Pr (sup — ef 1 ] > (^j = 

Pr (sup | (i^(e 1 - e ln ) > (^J 

< Pr^supKe^ex-e^)! >C?) 

< Pr (sup \i^\ > Ci~\sup |^r!(^l -£ln)| > Cl) 

+ Pr (sup \e^\ < Cr\ sup \e^{ei - hn)\ > (tj 

< Pr (sup \e^\ > Cr 1 ^) + Pr (sup | £l - e ln \ > C^j 

< 2C 2 . 

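Theorem 3(b) implies that $\phi^{-1}(s) = \exp(-\int_0^s \sum_{k=1}^d \kappa_k(t)\,dt_k)$, where $\kappa_k$
is the unique continuous function that solves $\varepsilon_1\kappa_k - ((\varepsilon_1)'_k - i\varepsilon_{2k}) = 0$ in $S^*$
on $\Omega$. By Assumptions 6(a-b) for $\phi$, $\phi^{-1}$ exists in $S^*_{0,m}(V)$ and $g = Ft^{-1}(\varepsilon_1\phi^{-1})$
in $S^*$. By Assumption 5(a) $\varepsilon_1 = \gamma\phi$ is non-zero on $\mathrm{supp}(\gamma)$.

Consider the estimator function $\kappa_{kn} = \varepsilon_{1n}^{-1}((\varepsilon_{1n})'_k - i\varepsilon_{2kn})$ on compact
$A \subset \Omega$; the function $\varepsilon_1$ is continuous there and it follows that $(\varepsilon_1)'_k - i\varepsilon_{2k} = \varepsilon_1\kappa_k$
is continuous.

The sequences of random functions $((\varepsilon_{1n})'_k - i\varepsilon_{2kn})$ converge in probability
uniformly on $A$ in $S^*$ to the continuous function $((\varepsilon_1)'_k - i\varepsilon_{2k})$. Define
$B = \sup_{A,k} |(\varepsilon_1)'_k - i\varepsilon_{2k}|$. For $0 < \zeta_4$ find $N_2$ such that

$$\Pr(\sup_A |((\varepsilon_{1n})'_k - i\varepsilon_{2kn}) - ((\varepsilon_1)'_k - i\varepsilon_{2k})| > \zeta_4) < \zeta_2 \quad \text{for } n > N_2.$$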

Bound Pr(sup \k kn - x k \ > ( 5 ) < 



A 



Pr(sup|e ln 1 | \ ((e ln )' k - ie 2n - (ei)' k + ie 2 \+sup - e^l \(ei)' k - ie k2 \ > C 5 ) < 

A "A 



Pr(sup \e lr ]\ > Ci V ) + ^sup |((ei n )' fc - ie 2kn - (e^ + ie k2 \ > 
+ Pr(sup|5 ln 1 -er 1 | > Q/B). 

A 

If C5 = min {CiB, C4/C1} the probability as n > max {N 1: N 2 } is less than 
4C 2 - 

Then Pr(sup \f° J2Li*kn(t)dt k - J° Et=iMt)dt k \ > C 6 ) < 

A 

Pr(sup f Y.k=i\ k kn(t)-H k (t)\dt k > C 6 ) < Pr (sup \k kn (t) - x k (t)\ > C 6 /M A ) J , 

A Jo V A / 

where /z(A) is the measure of the compact set A. For ( 6 = M^OCs then the 
probability is less than A( 2 . 

Consider now on A the function <f) n (s) = exp(— J Q S ^2 k=1 >c kn (t)dt k ). De- 



fine B = supV(l + s 2 ) m ; then supU _1 (s)| < B. 

A A 
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Then Pr(sup <t>* - <f x > ( 7 ) < 



Pr(sup 

A 



Ei=l^kn(t)dt k - / J2 d k=1 H k (t)dt k 



>\n(l + B- 1 ( 7 )), 



and is smaller than A( 2 for ( 7 = ln(l + B 1 C, & ). 

Since for the continuous functions sup \4>~ l ei n — < 

A 

.Bsup \ei — e ln \ + sup \ei | sup U" 1 - 

A A A 

by similar derivations 



Pr ( sup 

A 



> Cs < 5C 2 



ifC 8 <min{ J B ^x, C 7 ( SU P kil) 1 - 



If Q is bounded then by Assumption 5 sup \ <p 1 (s)| is uniformly bounded 



n 



and since n converges to 1 in probability uniformly on any compact set 



inside f2 then also 4> n is bounded away from zero and then n £ ln — 
converges in probability to zero on fl This concludes the proof of (i). 

Step 3. If $\mathrm{supp}(\gamma)$ is unbounded consider $\tilde\phi_n$ defined in (ii). From the
proof in Step 1 it follows that for large enough $N$ the estimator $\tilde\phi_n = \phi_n$ on
any compact $A$ with arbitrarily high probability, and then $\varepsilon_{1n}\tilde\phi_n^{-1}$ converges
to $\gamma$ on $A$ in probability in $S^*$.

Consider an arbitrary set $\psi_1, \ldots, \psi_v \in S$ and the corresponding compact
set $A$ defined by Lemma 3 and show that $\Pr(|(\tilde\gamma_n - \gamma, \psi)| > \zeta)$ goes to zero.

Since $\varepsilon_{1n} - \varepsilon_1$ converges to zero in probability in $S^*$ by Lemma 3 and
since on $\Omega$ this difference is a continuous function, then also $|\varepsilon_{1n}(t) - \varepsilon_1(t)|$
converges to zero in probability in $S^*$ on $\Omega$. Thus

$$\int V(1 + t^2)^m\,|\varepsilon_{1n}(t) - \varepsilon_1(t)|\,|\psi(t)|\,dt$$

converges in probability to zero. Then since



< sup 

A 



L 



dt 



+ / v (1 + t 2 ) m \ £l (t)\ m 

Jn\A 



dt 



+ / i/(i+t 2 ) m iMi)-£iWi m 

Jn\A 



dt, 



it follows that Pr( 



(7n-7,^) > < 



Pr(sup 



A 



-1 

m dt ) C) 



+ Pr / V(l + t 2 ) m \ei(t)\ m dt) > C) 
\Jn\A J 

: / V (l+t 2 ) m |£i„(t)-£i(t)| m dt) > C) 
'n\A / 



Here, as shown in Step 2, the first probability converges to zero, the
second converges to zero by Assumption 6 on $\varepsilon_1$, the definition of the set $A$
and Lemma 3, and the third by convergence of $\varepsilon_{1n}$. Then $\tilde\gamma_n$ converges in
probability to $\gamma = \varepsilon_1\phi^{-1}$ in $S^*$. Taking inverse Fourier transforms in $S^*$
concludes the proof. ■

The Theorem provides consistency of plug-in estimators for solutions to
the system of equations (16) and consequently (3) in a fairly general set-up.
Nevertheless some assumptions can be further relaxed. Of course, establishing
results for a compact support of the Fourier transforms is much easier,
and thus using spectral cut-off can be advantageous, especially when high
frequency components of the regression function may be commensurate with
the magnitude of the error components.

Computation of the estimators requires applying Fourier transforms and
inverse Fourier transforms. This can be accomplished with numerical
algorithms. However, it is possible to simplify the estimated $\varepsilon_\cdot$. Consider instead

n 

of the estimator in ( 1221) , w\ (z) n , an estimator computed as Yl 

i=l 

n 

with the weight a>i = Yl K ^ 3 '^ ) replacing the z— dependent weight in 

Wi (z) n and similar estimators for W2k (z) n . Then the corresponding Fourier 

transform £i n (s) can be expressed as a~ x yie %s Zj sincl ) , where by def- 

i=i \ w / 

inition sinc(x) = sir ^ x , and similar expressions for E2kn- Further computation 
for the estimators would have to be done numerically. 
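
The reason a sinc factor appears can be checked numerically in the univariate case. The sketch below assumes an estimator of the form $w(z) = \sum_i c_i\,1\{|z_i - z| \le h\}$, where the coefficients $c_i$ (for example $y_i$ divided by a weight that does not depend on $z$) are supplied by the caller, and the Fourier transform convention $Ft(w)(s) = \int w(z) e^{isz}\,dz$. Under these assumptions each box contributes $2h\,e^{isz_i}\,\mathrm{sinc}(sh)$ with $\mathrm{sinc}(x) = \sin(x)/x$. The function name and the restriction to $d = 1$ are illustrative choices, not the paper's construction.

```python
import numpy as np

def ft_box_estimator(s, z, c, h):
    """Closed-form Ft of w(z') = sum_i c_i * 1{|z_i - z'| <= h} for scalar z_i."""
    s = np.asarray(s, dtype=float)
    z = np.asarray(z, dtype=float)
    c = np.asarray(c, dtype=float)
    # np.sinc(x) is sin(pi x) / (pi x), so sin(u) / u equals np.sinc(u / pi)
    window = 2.0 * h * np.sinc(s * h / np.pi)
    return window * (np.exp(1j * np.outer(s, z)) @ c)
```

At $s = 0$ the expression reduces to $2h\sum_i c_i$, the integral of the estimator itself; for other $s$ it agrees with integrating $c_i e^{isz}$ over each interval $[z_i - h, z_i + h]$.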
6 Conclusion

This paper was devoted to treating a single convolution equation and a
specific system of convolution equations. Many statistical models with various
independence conditions give rise to such equations; measurement error is
emphasized here, but equations of this type are also applicable in other models,
such as factor models and panel data models; many examples are presented
in Zinde-Walsh (2012). The results of this paper indicate conditions for
identification and well-posedness when casting these equations in terms of
generalized functions; the generalized functions approach enlarges the area
of applicability of the models.